Tags: python, apache-spark, hive, pyspark, beeline

Showing tables from specific database with Pyspark and Hive


I have several databases, each containing tables, in a Hive instance. I'd like to show the tables in one specific database (let's say 3_db).

+------------------+--+
|  database_name   |
+------------------+--+
| 1_db             |
| 2_db             |
| 3_db             |
+------------------+--+

If I start beeline from bash, there is nothing complex about it; I just run the following:

show databases;
show tables from 3_db;

When I use PySpark from an IPython notebook, the same approach does not work; the second line (show tables from 3_db) gives me an error:

sqlContext.sql('show databases').show()
sqlContext.sql('show tables from 3_db').show()

What is wrong here, and why does the same statement work in one place but not in the other?


Solution

  • Use "in" instead of "from" in the statement: sqlContext.sql("show tables in 3_db").show()
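
If the database name comes from a variable, the statement can be built in a small helper. The following is only a sketch: the function name show_tables_in is hypothetical, and it assumes that sql_context is an already created SQLContext or HiveContext (the sqlContext object from the question). It restricts the name to letters, digits, and underscores so that it can be placed into the statement unquoted, as in the answer.

```python
import re


def show_tables_in(sql_context, database):
    """Return the result of `show tables in <database>`.

    `sql_context` is assumed to be an existing SQLContext/HiveContext,
    i.e. the `sqlContext` object used in the question.
    `show_tables_in` is a hypothetical helper, not part of PySpark.
    """
    # Accept only letters, digits and underscores, so the name can be
    # inserted into the statement without quoting.
    if not re.match(r"\w+\Z", database):
        raise ValueError("unexpected database name: %r" % (database,))
    # Use IN rather than FROM, as suggested in the answer.
    return sql_context.sql("show tables in " + database)
```

In the notebook this would be called as show_tables_in(sqlContext, "3_db").show(). PySpark's SQLContext also provides tables(dbName) and tableNames(dbName), which may be worth trying as an alternative to a raw SQL string.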