📄 Data Management

Note: Check out our Dashboard for a streamlined data management interface.

Managing Data Sources

Add a Source

To add a new data source:

client.add_source(source_name='City')

Optionally, to add a data source with metadata support:

client.add_source(source_name='City', metadata_fields={"author":"str", "views":"int", "date":"date"})

metadata_fields is a dict that specifies the metadata fields of the data source and their types. We currently support three metadata types: str (String), int (Integer), and date (Date).
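The same schema can be used to check metadata values on the client side before they are sent. The following is a minimal sketch and not part of the client library: validate_metadata is a hypothetical helper, and it assumes that date values are ISO-formatted strings (YYYY-MM-DD), as in the examples on this page.

```python
from datetime import date


def _is_date(value):
    # Assumption: dates are ISO strings such as '2022-08-01'.
    if not isinstance(value, str):
        return False
    try:
        date.fromisoformat(value)
        return True
    except ValueError:
        return False


# Maps the type names used in metadata_fields to a check function.
_CHECKS = {
    "str": lambda v: isinstance(v, str),
    # bool is a subclass of int in Python, so exclude it explicitly.
    "int": lambda v: isinstance(v, int) and not isinstance(v, bool),
    "date": _is_date,
}


def validate_metadata(metadata_fields, metadata):
    """Hypothetical helper: return a list of problems; an empty list means valid."""
    problems = []
    for name, value in metadata.items():
        type_name = metadata_fields.get(name)
        if type_name is None:
            problems.append(f"unknown field: {name}")
        elif type_name not in _CHECKS:
            problems.append(f"unsupported type for {name}: {type_name}")
        elif not _CHECKS[type_name](value):
            problems.append(f"{name} should be of type {type_name}")
    return problems


schema = {"author": "str", "views": "int", "date": "date"}
print(validate_metadata(schema, {"author": "John", "views": 4500, "date": "2022-08-01"}))
print(validate_metadata(schema, {"views": "many"}))
```

Returning a list of problems instead of raising on the first one makes it easier to report every invalid field at once.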

You can also specify the embedding model to use for the data source:

client.add_source(source_name='City', embedding_model='bge-base-en-v1.5')

We currently support the following embedding models:

  • bge-base-en-v1.5 from BAAI
  • text-embedding-3-large, text-embedding-3-small and text-embedding-ada-002 from OpenAI
  • voyage-large-2 and voyage-2 from Voyage AI
  • embed-english-v3.0 and embed-multilingual-v3.0 from Cohere
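To catch a misspelled model name before calling the service, the list above can be kept as a small lookup table. This is only an illustrative sketch: SUPPORTED_EMBEDDING_MODELS and provider_of are hypothetical names, not part of the client library, and the table has to be updated by hand if the supported list changes.

```python
# Model names copied from the list above, grouped by provider.
SUPPORTED_EMBEDDING_MODELS = {
    "BAAI": ["bge-base-en-v1.5"],
    "OpenAI": [
        "text-embedding-3-large",
        "text-embedding-3-small",
        "text-embedding-ada-002",
    ],
    "Voyage AI": ["voyage-large-2", "voyage-2"],
    "Cohere": ["embed-english-v3.0", "embed-multilingual-v3.0"],
}


def provider_of(model_name):
    """Hypothetical helper: return the provider of a listed model, or None."""
    for provider, models in SUPPORTED_EMBEDDING_MODELS.items():
        if model_name in models:
            return provider
    return None


print(provider_of("voyage-2"))
print(provider_of("voyage-2from"))
```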

Delete a Source

To remove a data source:

client.delete_source(source_name='City')

List Sources

To get a list of available data sources:

sources = client.list_sources()

Get Source Info

To get information about a data source (e.g., its embedding model and token limit):

results = client.get_source_info(source_name='City')

Set Default Embedding Model for a Source

To set or change the default embedding model for a data source:

client.set_source_embedding_model(source_name="City", embedding_model="text-embedding-3-large")

Managing Files

Add a File

To add a file to a specific source:

client.add_file(source_name='City', local_path='/Users/John/Boston.txt')

local_path specifies the local path to the file you wish to upload. source_name specifies the data source to store the uploaded file. Supported file types are text, PDF and Word files.
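When uploading several files, it can help to skip files that are unlikely to be accepted. The sketch below wraps add_file with the keyword arguments shown above. looks_supported and add_files are hypothetical helpers, and the extension set is an assumption about what "text, PDF and Word files" means in practice; the service may accept a different set.

```python
from pathlib import Path

# Assumption: these extensions correspond to the supported text, PDF and Word
# file types. Adjust the set to match what the service actually accepts.
ASSUMED_EXTENSIONS = {".txt", ".pdf", ".doc", ".docx"}


def looks_supported(local_path):
    """Return True if the file extension is in the assumed supported set."""
    return Path(local_path).suffix.lower() in ASSUMED_EXTENSIONS


def add_files(client, source_name, local_paths):
    """Upload each path that looks supported; return the paths that were skipped."""
    skipped = []
    for local_path in local_paths:
        if looks_supported(local_path):
            client.add_file(source_name=source_name, local_path=str(local_path))
        else:
            skipped.append(local_path)
    return skipped
```

For example, add_files(client, 'City', ['/Users/John/Boston.txt', '/Users/John/photo.png']) would upload the first file and return the second path as skipped.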

Optionally, you can also add a file with metadata:

client.add_file(source_name='City', local_path='/Users/John/Boston.txt',
                metadata={'author':'John', 'views':4500, 'date':'2022-08-01'})

metadata is a dict that specifies the metadata fields and values associated with the file.

Delete a File

To remove a file from a particular source:

client.delete_file(source_name='City', file_name='Boston.txt')

file_name is the name of the file to be deleted. source_name specifies the source of the file to delete.

Download a File

To download a file from a designated source:

client.download_file(source_name='City', file_name='Boston.txt', local_path='/Users/John/Boston.txt')

file_name is the name of the file to download. source_name specifies the source of the file. local_path specifies the local path to save the downloaded file.

Upsert File Metadata

To insert or update metadata for a specific file:

client.upsert_file_metadata(source_name='City', file_name='Boston.txt', 
                           metadata={'author':'Michael', 'views':6000, 'date':'2023-02-15'})

List Files

To get a list of available files in a specific source:

files = client.list_files(source_name='City')

Get File Retrieval Status

To check if a specific file from a source is ready for retrieval:

status = client.get_file_retrieval_status(source_name='City', file_name='Boston.txt')
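A file may not be ready immediately after upload, so a caller may want to poll this status. The following is a rough sketch under stated assumptions: wait_until_ready is a hypothetical helper, not part of the client library, and because the shape of the returned status is not described here, the caller supplies an is_ready function that decides when the status means "ready".

```python
import time


def wait_until_ready(client, source_name, file_name, is_ready,
                     timeout=60.0, interval=2.0):
    """Poll get_file_retrieval_status until is_ready(status) is true.

    Returns the last status on success; raises TimeoutError otherwise.
    """
    deadline = time.monotonic() + timeout
    while True:
        status = client.get_file_retrieval_status(
            source_name=source_name, file_name=file_name)
        # is_ready is supplied by the caller because the status format
        # is an unknown here.
        if is_ready(status):
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"{file_name} was not ready after {timeout} seconds")
        time.sleep(interval)
```

The file is always checked at least once, even with a timeout of 0, and time.monotonic is used so that system clock adjustments do not affect the deadline.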

Get Chunks of a File

To get the chunks of a specific file from a data source:

chunks = client.get_file_chunks(source_name='City', file_name='Boston.txt')