Drilling into Big Data — Data Ingestion (4)

$ hadoop fs -ls /user
$ hadoop fs -mkdir /user/demo

SQOOP

  • Import individual tables or entire databases to files in HDFS
  • Import from SQL databases directly into your Hive data warehouse
  • Sqoop import
  • Sqoop export
$ sudo cp ojdbc6.jar /usr/lib/sqoop-current/lib
$ sudo cp sqljdbc_6.0.81_enu.tar.gz /usr/lib/sqoop-current/lib
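
The export direction listed above is not shown elsewhere in this post, so here is a minimal sketch of what it could look like. The JDBC URL, user name, table name, and HDFS directory are hypothetical placeholders, not values from the original setup. The sketch assumes the HDFS files are comma-delimited text (Sqoop's default text import format) and that the target table already exists in the database. The command is only printed unless RUN_SQOOP=1 is set and sqoop is on the PATH.

```shell
# All values below are hypothetical placeholders; substitute your own.
JDBC_URL="jdbc:oracle:thin:@dbhost.example.com:1521:ORCL"
DB_USER="demo_user"
EXPORT_TABLE="TRIP_ADVISOR_COPY"   # assumed to exist already in the database
EXPORT_DIR="/user/demo/sqoop"      # HDFS directory holding the files to export

# Build the export command as positional parameters so that quoting is preserved.
set -- sqoop export \
  --connect "$JDBC_URL" \
  --username "$DB_USER" \
  -P \
  --table "$EXPORT_TABLE" \
  --export-dir "$EXPORT_DIR" \
  --input-fields-terminated-by ',' \
  -m 1

# Keep a printable copy for review before running anything.
EXPORT_CMD="$*"
echo "$EXPORT_CMD"

# Execute only when explicitly requested and sqoop is installed.
if [ "${RUN_SQOOP:-0}" = "1" ] && command -v sqoop >/dev/null 2>&1; then
  "$@"
fi
```

Printing the command first makes it easy to review the connection string and directory before any rows are written to the database.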

File Formats

Manage Parallelism

Sqoop Import

sqoop eval --connect jdbc:oracle:thin:@182.156.193.194:1556:ORCL --username xxx --password xxx --query "SELECT * FROM TRIP_ADVISOR WHERE ROWNUM <= 3"
sqoop import --connect jdbc:oracle:thin:@182.156.193.194:1556:ORCL --username xxx -P --table TRIP_ADVISOR --target-dir hdfs://emr-header-1.cluster-88549:9000/user/demo/sqoop -m 1

Troubleshooting Issues

sqoop import --connect jdbc:oracle:thin:@192.168.6.23:1526:xxx --username xxx -P --table xxx --target-dir hdfs://emr-header-1.cluster-88549:9000/user/demo/sqoop -m 1 --driver oracle.jdbc.driver.OracleDriver
  • The Oracle service might not be running on the given host and port
  • The firewall might restrict client access to the Oracle server
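
Both causes above show up as a failed TCP connection, so a quick reachability check from the Hadoop node can narrow things down before touching Sqoop. The following is a rough sketch, assuming a netcat build that supports the -z and -w options is installed. The function name check_port and the ORACLE_HOST and ORACLE_PORT variables are made up for this illustration.

```shell
# check_port is a hypothetical helper, not part of Sqoop or Hadoop.
# $1 = host, $2 = port. Returns 0 if a TCP connection succeeds within 5 seconds.
check_port() {
  nc -z -w 5 "$1" "$2" >/dev/null 2>&1
}

# Run the check only when both variables are set, for example:
#   ORACLE_HOST=192.168.6.23 ORACLE_PORT=1526 sh check_oracle.sh
if [ -n "${ORACLE_HOST:-}" ] && [ -n "${ORACLE_PORT:-}" ]; then
  if check_port "$ORACLE_HOST" "$ORACLE_PORT"; then
    echo "Port $ORACLE_PORT on $ORACLE_HOST is reachable"
  else
    echo "Cannot reach $ORACLE_HOST:$ORACLE_PORT (service down or blocked by a firewall?)"
  fi
fi
```

A successful check only shows that something is listening on the port. It does not confirm that the listener is Oracle or that the SID in the JDBC URL is correct.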

Best Practices

  • Sqoop does not support a few Hadoop file formats, such as ORC and RC
  • Writing schema and table names in uppercase avoids some issues
  • Use --split-by if you need multiple mappers
  • Avoid column names that are keywords in Sqoop
  • If a table has no primary key and --split-by is not provided in the command, the import fails unless the number of mappers is explicitly set to one
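
The two points about --split-by can be sketched as a parallel import. The JDBC URL, user name, split column, and target directory below are hypothetical placeholders. The sketch assumes the table has a numeric column named ID whose values are spread fairly evenly, which is what makes it a reasonable split column. The command is only printed unless RUN_SQOOP=1 is set and sqoop is on the PATH.

```shell
# All values below are hypothetical placeholders; substitute your own.
JDBC_URL="jdbc:oracle:thin:@dbhost.example.com:1521:ORCL"
DB_USER="demo_user"
TABLE="TRIP_ADVISOR"                    # uppercase, as recommended above
SPLIT_COL="ID"                          # assumed numeric, evenly distributed column
TARGET_DIR="/user/demo/sqoop_parallel"  # assumed HDFS output directory

# Four mappers, each importing one range of SPLIT_COL values.
set -- sqoop import \
  --connect "$JDBC_URL" \
  --username "$DB_USER" \
  -P \
  --table "$TABLE" \
  --split-by "$SPLIT_COL" \
  --num-mappers 4 \
  --target-dir "$TARGET_DIR"

# Keep a printable copy for review before running anything.
IMPORT_CMD="$*"
echo "$IMPORT_CMD"

# Execute only when explicitly requested and sqoop is installed.
if [ "${RUN_SQOOP:-0}" = "1" ] && command -v sqoop >/dev/null 2>&1; then
  "$@"
fi
```

If no suitable split column exists, dropping --split-by and setting the number of mappers to 1 falls back to the single-mapper import used earlier in this post.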


Follow me to keep abreast with the latest technology news, industry insights, and developer trends. Alibaba Cloud website: https://www.alibabacloud.com
