I am trying to create a Python utility that takes a dataset from Vertex AI Datasets and generates statistics for it. However, I am unable to inspect the dataset from a Jupyter notebook. Is there a way around this?
If I understand correctly, you want to use a Vertex AI dataset inside a Jupyter notebook. I don't think this is currently possible. You can, however, export Vertex AI datasets to Google Cloud Storage in JSONL format:
Your dataset will be exported as a list of text items in JSONL format. Each row contains a Cloud Storage path, any label(s) assigned to that item, and a flag that indicates whether that item is in the training, validation, or test set.
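Once the export is in Cloud Storage and you have downloaded it, basic statistics can be computed with the standard library alone. Below is a minimal sketch, not a definitive implementation. The field names gcs_path, label and ml_use are hypothetical placeholders; the real keys depend on the dataset type, so check a line of your own exported file and adjust them.

```python
import json
from collections import Counter

# Hypothetical field names; replace them with the keys found in your export.
LABEL_KEY = "label"
SPLIT_KEY = "ml_use"


def summarize_export(lines, label_key=LABEL_KEY, split_key=SPLIT_KEY):
    """Count items, labels, and training/validation/test flags in JSONL rows."""
    label_counts = Counter()
    split_counts = Counter()
    total = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        row = json.loads(line)
        total += 1
        labels = row.get(label_key, [])
        if isinstance(labels, str):
            labels = [labels]  # treat a single label as a one-element list
        label_counts.update(labels)
        split_counts[row.get(split_key, "unassigned")] += 1
    return {
        "items": total,
        "labels": dict(label_counts),
        "splits": dict(split_counts),
    }


# Made-up sample rows standing in for an exported file.
sample = [
    '{"gcs_path": "gs://my-bucket/a.txt", "label": "spam", "ml_use": "training"}',
    '{"gcs_path": "gs://my-bucket/b.txt", "label": "ham", "ml_use": "test"}',
    '{"gcs_path": "gs://my-bucket/c.txt", "label": "spam", "ml_use": "training"}',
]
stats = summarize_export(sample)
print(stats)
```

For a real file, you could pass an open file object instead of the sample list, since the function only needs an iterable of lines.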
At the moment, you can use BigQuery data inside a notebook with the %%bigquery magic, as described in Visualizing BigQuery data in a Jupyter notebook, or use pandas read_csv() to read a file from the machine's local directory or from GCS, as shown in the How to read csv file in Google Cloud Platform jupyter notebook thread.
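If the data is available as a CSV file, per-column statistics are straightforward to compute in the notebook. The sketch below deliberately uses only the standard library (csv and statistics) rather than pandas so that it runs anywhere; the function name numeric_column_stats and the sample columns are made up for illustration.

```python
import csv
import io
import statistics


def numeric_column_stats(csv_file):
    """Return count, mean, min and max for every column that holds numbers."""
    columns = {}
    for row in csv.DictReader(csv_file):
        for name, value in row.items():
            try:
                number = float(value)
            except (TypeError, ValueError):
                continue  # ignore non-numeric or missing values
            columns.setdefault(name, []).append(number)
    return {
        name: {
            "count": len(values),
            "mean": statistics.mean(values),
            "min": min(values),
            "max": max(values),
        }
        for name, values in columns.items()
    }


# Made-up sample data; in a notebook you would open a local copy of the file,
# e.g. open("data.csv", newline=""), instead of using StringIO.
sample_csv = io.StringIO("name,age,score\nann,30,1.5\nbob,40,2.5\n")
csv_stats = numeric_column_stats(sample_csv)
print(csv_stats)
```

With pandas installed, read_csv() followed by DataFrame.describe() gives a similar summary in two lines.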
However, you can file a Feature Request in the Google Issue Tracker asking for the ability to use Vertex AI datasets directly in a Jupyter notebook; it will be considered by the Google Vertex AI team.