Creating & Using Data Sources
This guide covers uploading and managing document-based data sources, then querying them for relevant AI context.
MindStudio allows you to create internal data sources directly within your projects. These data sources are ideal for uploading documents—like support guides or product manuals—that your AI agents can reference to generate accurate, contextual responses.
Types of Data Sources
MindStudio supports several types of data sources:
Integration data sources: External services like Google Docs or Sheets, brought in via integration blocks.
Internal databases: Custom backends or structured tables, supported via advanced connections.
Document-based project data sources: Files uploaded directly into your project’s "Data Sources" folder—this is the focus of this guide.
Example Use Case: Creating a Support Bot
To demonstrate how document-based data sources work, we'll create a support bot that answers questions about MindStudio using uploaded documentation.
Step 1: Add a User Input Block
Begin your AI agent with a User Input block. This block captures the user's question and stores it in a variable, typically called query.
Step 2: Upload Documentation
Navigate to the Data Sources section in the left-hand panel. Click the plus button to create a new data source:
Name it (e.g., Mind Studio Docs)
Add a description
Upload documents (up to 150 files, each ≤50MB)
Tip: Use a free PDF compression service if your documents are too large.
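Before uploading, it can help to check a batch of files against the limits listed above. The following is a minimal sketch, not a MindStudio tool: the function names are hypothetical, and it assumes "50MB" means 50 × 1024 × 1024 bytes.

```python
from pathlib import Path

MAX_FILES = 150                      # per-data-source file limit stated in this guide
MAX_FILE_BYTES = 50 * 1024 * 1024    # "50MB" read as 50 MiB (assumption)


def check_upload_batch(sizes_by_name: dict[str, int]) -> list[str]:
    """Return a list of problems; an empty list means the batch fits the limits."""
    problems = []
    if len(sizes_by_name) > MAX_FILES:
        problems.append(
            f"{len(sizes_by_name)} files exceeds the {MAX_FILES}-file limit"
        )
    for name, size in sorted(sizes_by_name.items()):
        if size > MAX_FILE_BYTES:
            problems.append(f"{name} is {size / 1_048_576:.1f} MiB, over the 50MB limit")
    return problems


def sizes_in_folder(folder: str) -> dict[str, int]:
    """Collect sizes for every regular file directly inside `folder`."""
    return {p.name: p.stat().st_size for p in Path(folder).iterdir() if p.is_file()}
```

Calling check_upload_batch(sizes_in_folder("docs")) on a local folder lists any files that would need compressing or splitting first.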
Step 3: Monitor the Upload
As the document uploads, it will be processed into a vector database:
You’ll see a word count and chunk count.
Review the extracted text to ensure formatting looks clean.
Check the chunk preview to understand how the document is split.
Use the index snippet to reference the full document, or query it with natural language.
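To build intuition for the word count and chunk count shown after processing, here is a rough sketch of word-based chunking with overlap. This guide does not describe MindStudio's actual splitting strategy, so the chunk size, the overlap, and the function name are all illustrative assumptions.

```python
def chunk_words(text: str, chunk_size: int = 200, overlap: int = 20) -> list[str]:
    """Split text into chunks of up to `chunk_size` words, sharing `overlap` words."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    words = text.split()
    step = chunk_size - overlap          # how far each new chunk advances
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break                        # the last chunk already reached the end
    return chunks


document = "word " * 450
chunks = chunk_words(document)
print(f"words: {len(document.split())}, chunks: {len(chunks)}")
```

Overlap is one common way to keep a sentence that straddles a boundary visible in both neighbouring chunks. If the chunk preview shows ideas cut off mid-thought, restructuring the source document may help.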
Querying the Data Source
Step 4: Add a Query Data Source Block
Insert the Query Data Source block into your workflow:
Select your uploaded data source.
Set the output variable (e.g., query_result).
Use the query variable (from the user input) to trigger the search.
Optionally adjust the number of chunks retrieved (default is 3, max is 5).
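Conceptually, the block ranks stored chunks by similarity to the query and returns the top few. The sketch below uses a toy word-count cosine similarity as a stand-in for real vector embeddings. It is not MindStudio's implementation, and every name in it is hypothetical; only the default of 3 and the maximum of 5 come from this guide.

```python
import math
import re
from collections import Counter

DEFAULT_CHUNKS = 3   # default stated in this guide
MAX_CHUNKS = 5       # maximum stated in this guide


def _vector(text: str) -> Counter:
    """Toy 'embedding': lowercase word counts (a real system uses learned vectors)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def _cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def query_chunks(query: str, chunks: list[str], top_k: int = DEFAULT_CHUNKS) -> list[str]:
    """Return the `top_k` chunks most similar to `query`, clamped to 1..MAX_CHUNKS."""
    top_k = max(1, min(top_k, MAX_CHUNKS))
    q = _vector(query)
    ranked = sorted(chunks, key=lambda c: _cosine(q, _vector(c)), reverse=True)
    return ranked[:top_k]
```

Retrieving more chunks gives the model more context but also more loosely related text, which is one reason to keep the count small.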
Generating Contextual Responses
Step 5: Add a Generate Text Block
Use a Generate Text block to create your AI’s response, with a prompt such as:
<context>
{{query_result}}
</context>
Use the info above to answer the following question:
{{query}}
This setup ensures the AI receives relevant context before answering.
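To see what the model actually receives, the variable substitution can be approximated as below. This is a sketch only: MindStudio performs the substitution itself, and the render function here is a hypothetical stand-in that handles plain {{name}} placeholders.

```python
import re

PROMPT_TEMPLATE = """<context>
{{query_result}}
</context>
Use the info above to answer the following question:
{{query}}"""


def render(template: str, variables: dict[str, str]) -> str:
    """Replace each {{name}} placeholder with its value; fail loudly if one is missing."""
    def substitute(match: re.Match) -> str:
        name = match.group(1)
        if name not in variables:
            raise KeyError(f"no value for variable {name!r}")
        return variables[name]

    return re.sub(r"\{\{\s*(\w+)\s*\}\}", substitute, template)


prompt = render(
    PROMPT_TEMPLATE,
    {
        "query_result": "Data sources accept up to 150 files.",
        "query": "How many files can I upload?",
    },
)
print(prompt)
```

Placing the retrieved text before the question, inside clearly delimited tags, makes it easy for the model to tell the reference material apart from the user's request.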
Step 6: Preview the Agent
Use the Draft Agent preview to test your support bot. As users ask questions, the system:
Queries the vectorized document.
Retrieves relevant text chunks.
Uses those chunks as context to generate an answer.
Advanced: Using the Entire Document
If your model has a large enough context window (e.g., Claude 3.5 Haiku supports 200k tokens), you can pass the entire document to the AI using the index snippet.
Caution: Passing full documents may reduce performance or make the AI less precise. Use only when full context is necessary.
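A quick size estimate can indicate whether a document is likely to fit. The sketch below assumes roughly four characters per token, a common rule of thumb for English text. Real tokenizers vary, so treat the result as approximate. The reserved-token budget and the function names are illustrative assumptions.

```python
CONTEXT_WINDOW_TOKENS = 200_000   # figure quoted in this guide for Claude 3.5 Haiku
CHARS_PER_TOKEN = 4               # rough heuristic (assumption); tokenizers differ


def estimate_tokens(text: str) -> int:
    """Approximate token count, rounding up."""
    return -(-len(text) // CHARS_PER_TOKEN)


def fits_in_context(
    document: str,
    reserved_tokens: int = 4_000,          # room for instructions and the answer
    window: int = CONTEXT_WINDOW_TOKENS,
) -> bool:
    """Return True if the document plus the reserved budget fits the window."""
    return estimate_tokens(document) + reserved_tokens <= window


manual = "word " * 100_000                 # about 500,000 characters
print(estimate_tokens(manual), fits_in_context(manual))
```

When the estimate is close to the limit, querying chunks is the safer choice.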
Key Takeaways
Data sources in MindStudio allow AI agents to query long-form documents with natural language.
Use them to build agents like knowledge bases, support bots, or product Q&A tools.
Choose between querying small chunks for relevance or referencing full documents for completeness.
Always validate uploaded files by checking the extracted text for formatting issues.
Data sources are a powerful way to give your AI agents domain-specific expertise—using the same documentation your team already relies on.