This article shows how to connect Apache Drill to Hive.
Prerequisites:
1. Hadoop should be installed.
2. Hive should be installed.
3. Apache Drill should be installed.
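Before starting, it can help to confirm that the three prerequisites are actually reachable from your terminal. The following is only a minimal sketch: it assumes the hadoop and hive commands are on your PATH, and DRILL_HOME (with its /opt/drill default) is a hypothetical install location that is not taken from this article.

```shell
#!/usr/bin/env sh
# DRILL_HOME is an assumed install location; point it at your own Drill directory.
DRILL_HOME="${DRILL_HOME:-/opt/drill}"

# Print whether a command is available on PATH.
check_cmd() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "$1: found"
  else
    echo "$1: missing"
  fi
}

check_cmd hadoop
check_cmd hive

# Drill keeps its launcher scripts under bin/ in the install directory.
if [ -x "$DRILL_HOME/bin/drill-embedded" ]; then
  echo "drill-embedded: found"
else
  echo "drill-embedded: missing (checked $DRILL_HOME/bin)"
fi
```

If anything is reported as missing, fix the installation or the PATH first; the steps in this article depend on all three.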
Implementation:
1. Go to the directory where Hive is installed.
2. Start the Hive Metastore service.
3. Wait until the terminal shows that the Hive Metastore service has started successfully.
4. Open a new terminal.
5. Go to the directory where Apache Drill is installed.
6. Start the Drill server in embedded mode.
7. Once the Drill server has started in embedded mode, you will see the Drill prompt.
8. Open the Apache Drill web UI in a browser at http://localhost:8047.
9. Click Storage, then click Enable next to the hive storage plugin. The plugin should now appear under the enabled storage plugins.
10. Click Update next to the hive storage plugin and edit its configuration so that it points to your Hive Metastore service.
11. Go back to the terminal where the Drill server is running (step 7).
12. Change the schema to hive.
13. Run show tables to list all the Hive tables.
14. You can now run any query on the Hive tables. Remember that the query will not launch a MapReduce job; Drill reads the table metadata from the Hive Metastore and executes the query with its own engine.
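The steps above can be sketched as a small shell script. Treat it as an illustration under stated assumptions rather than the exact setup used in this article: HIVE_HOME and DRILL_HOME and their defaults are hypothetical install locations, the metastore address thrift://localhost:9083 assumes the default metastore port on the local machine, hdfs://localhost:9000/ is an assumed NameNode address, and my_table is a placeholder table name. The plugin values shown in the original screenshot are not reproduced here, so use the ones that match your own cluster.

```shell
#!/usr/bin/env sh
# Assumed install locations (not from the article); adjust to your machine.
HIVE_HOME="${HIVE_HOME:-/opt/hive}"
DRILL_HOME="${DRILL_HOME:-/opt/drill}"

# Terminal 1: start the Hive Metastore service (it keeps running in the foreground).
start_metastore() {
  ( cd "$HIVE_HOME" && bin/hive --service metastore )
}

# Terminal 2: start Drill in embedded mode; this opens the Drill prompt.
start_drill() {
  ( cd "$DRILL_HOME" && bin/drill-embedded )
}

# Example hive storage plugin configuration to paste into the web UI at
# http://localhost:8047 (Storage, then Update next to hive).
# The URIs below are assumptions for a single-machine setup.
write_plugin_config() {
  cat <<'EOF'
{
  "type": "hive",
  "enabled": true,
  "configProps": {
    "hive.metastore.uris": "thrift://localhost:9083",
    "hive.metastore.sasl.enabled": "false",
    "fs.default.name": "hdfs://localhost:9000/"
  }
}
EOF
}

# Statements to type at the Drill prompt once the plugin is enabled.
# my_table is a hypothetical table name.
drill_queries() {
  cat <<'EOF'
USE hive;
SHOW TABLES;
SELECT * FROM my_table LIMIT 10;
EOF
}
```

After sourcing the script, run start_metastore in one terminal and start_drill in another. Calling write_plugin_config prints the JSON to paste into the plugin editor, and drill_queries prints the statements to enter at the Drill prompt.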
Conclusion:
I hope this helps you understand how to configure Apache Drill with the Hive Metastore so that you can query Hive tables directly.
Sourabh Jain
Big Data & Analytics Architect