Spark includes two useful functions to list databases and tables: spark.catalog.listDatabases() and spark.catalog.listTables().
Both use the Spark catalog API and can run for an extremely long time, sometimes minutes (!), because they try to fetch all available metadata for every object. However, if you only need basic metadata, such as database names and table names, you can use Spark SQL:
show databases
show tables from db_name
Both statements return almost instantly. To use them from Python or Scala, just wrap the statement in spark.sql: for example, spark.sql("show databases") returns a DataFrame with the info you require.
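As a rough sketch of how this might look from Python, the two helpers below wrap the statements above and pull the names out of the resulting DataFrames. The function names list_database_names and list_table_names are made up for illustration, and spark is assumed to be an existing SparkSession (as it is in a notebook). The database name is read by position because the column label differs between Spark versions.

```python
def list_database_names(spark):
    """Return database names via the lightweight SHOW DATABASES statement.

    `spark` is assumed to be an already created SparkSession.
    """
    # The first column holds the name; its label varies across Spark
    # versions, so read it by position rather than by name.
    return [row[0] for row in spark.sql("show databases").collect()]


def list_table_names(spark, db_name):
    """Return the table names in `db_name` via SHOW TABLES."""
    # db_name is interpolated into the statement as-is, so only pass
    # identifiers you trust.
    rows = spark.sql(f"show tables from {db_name}").collect()
    return [row["tableName"] for row in rows]


# Example usage, assuming a live session named `spark`:
#   for db in list_database_names(spark):
#       print(db, list_table_names(spark, db))
```

Calling collect() is fine here because the result only contains names; the heavy per-object metadata that makes the catalog API slow is never requested.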