Sunday 22 January 2017

Sqoop Password Encryption

Sqoop Password Encryption ➽➽➽➣

Sqoop ==> A tool designed to easily IMPORT data from DBMS(Relational database management system) into Hadoop cluster and EXPORT data from hadoop cluster to RDBMS.





Sqoop has gained a lot of popularity amongst Hadoop ecosystem which made it one of the darling tools in the Bigdata industry.

Firstly, when we import or export data from database through Sqoop then it is mandatory to provide password either through command line interface or through a file placed on a secured server location.

Let's Discuss in detail the various options available for managing passwords in Sqoop.

The First :--PASSWORD

This is a very rudimentary acceptable way of passing a database password to Sqoop if security is not a major concern.

sqoop  import   --connect jdbc:mysql://localhost:3306/sqoop  \
--username root  \
--password root  \
--table student

Second: -P

One of the problems with the first option  --password is that it logs the password in shell history.

Luckily, Sqoop supports reading passwords via input stream (STDIN) with echo turned off which in turn prompt users to input the password via. command line interface.

sqoop import  --connect jdbc:mysql://localhost:3306/sqoop \
--username root  \
-P  \
--table student

The -P option will tell Sqoop to prompt for a password. This will allow you to pass your password through the standard input stream as below snippet.






Third: Passing password through a password file :

The Password file is stored in HDFS(Hadoop Distributed File System) not in local system.

This password file should have 400 (read+write -Only for owner) permission so that it is not exposed to other users.

Let's follow few steps to create a password file:


  • vi sqoop.pwd — Enter the password and save it.
  • Transfer the sqoop.pwd file to HDFS by the command below:

                             hadoop fs –put sqoop.pwd /sqoop.pwd

  • Change the ownership and permission of sqoop.pwd file for more security.
  • Remove the sqoop.pwd in current directory(non-hdfs)

Post which execute the below command:

sqoop import --connect jdbc:mysql://localhost:3306/sqoop \
--username root \
--password-file /sqoop.pwd \
--table student \
--target-dir  '/sqoop_output/student_test'

Fourth:  HADOOP CREDENTIAL PROVIDER API

In Hadoop 2.6, a fundamental feature was introduced – the Credential API.

The CredentialProvider API in Hadoop provides a layer of abstraction between application and password the way they are stored

Sqoop 1.4.5 and later versions support the credential API keystore.

Now let's follow below steps to generate an encrypted credential keystore.

Note: On prompt, enter the password that will be used to access the database.

usre@nn1:- $ hadoop credential create mydb.password.alias -provider jceks://hdfs/user/root/mysql.password.jceks
Enter password:
Enter password again:

After successful generation of keystore below message will be printed.

mydb.password.alias has been successfully created.
org.apache.hadoop.security.alias.JavaKeyStoreProvider has been updated.

Now you can make use of mydb.password.alias file in your sqoop command

Add Below Property in sqoop command and change the password.

-Dhadoop.security.credential.provider.path = jceks://hdfs/user/root/mysql.password.jceks

Then follow the below command:

sqoop import -Dhadoop.security.credential.provider.path=jceks://hdfs/user/root/mysql.password.jceks
--connect jdbc:mysql://localhost:3306/sqoop \
--username sqoop \
--password-alias mydb.password.alias \
--table stdent \
--target-dir '/sqoop_output/student_test'

This way password is hidden inside jceks://hdfs/user/root/mysql.password.jceks
Note: Alias “mydb.password.alias” is used to provide the password to the sqoop command to connect to the source database.

36 comments:

  1. Very helpful.Thanks for sharing

    ReplyDelete
  2. Thanks for Explaining. It's very helpful.

    ReplyDelete
  3. Good Manoj. I liked the way you presented

    ReplyDelete
  4. This comment has been removed by the author.

    ReplyDelete
  5. Congratulations Manoj!
    Making India proud by sharing technology blogs to the World.
    Happy Republic day and keep it up!

    ReplyDelete
  6. Good Manoj. I liked the way you presented

    ReplyDelete
  7. Nicely explained Manoj..!

    Keep writing such helpful articles.

    ReplyDelete
  8. Nice explanation Manoj. It's really helpful.

    ReplyDelete
  9. Good Article Manoj, Keep it coming!

    ReplyDelete
  10. Can you put a demo or video demonstrating the same.

    ReplyDelete
    Replies
    1. Hi There,
      A detail video related to Sqoop and password Encryption is on its way. Stay tuned.

      Delete
  11. Hi Manoj,
    Nicely explained! Looking forward more such articles on Sqoop.
    Also very much keen to see the demo of the same.

    ReplyDelete
  12. Excellent this is the time if you take more initiative's like these kind of things, your profile will be more elevated and people like us can learn more things

    ReplyDelete
  13. Very well written. Keep on posting 😊

    ReplyDelete
  14. Really Good blog post.provided a helpful information.I hope that you will post more updates like this Big Data Hadoop Online Training Bangalore

    ReplyDelete
  15. Very useful..worked in first attempt.

    ReplyDelete