Why This Matters Now
Building AI-driven document agents is becoming increasingly common, but ensuring that these systems respect user permissions is crucial. Traditional authorization methods fall short in RAG systems, where documents are the unit of access and LLMs synthesize information across multiple documents. Recent incidents highlight the risks of inadequate authorization, making it essential to implement robust security measures now.
The Problem Is That AI Makes Authorization Harder
Traditional authorization in web applications is typically coarse-grained, focusing on roles and permissions at the endpoint level. However, this approach breaks down in RAG systems for several reasons:
- Document-Level Access Control: In RAG systems, documents are the fundamental units of access. A single /search endpoint might retrieve data from thousands of documents, each with its own owner. Granting or denying access to the endpoint alone is insufficient; document-level permissions are necessary.
- Synthesis Across Documents: Even if you filter the retrieval results correctly, subtle bugs can allow unauthorized documents to enter the prompt context. The model might inadvertently include these documents in the final response, leading to data leaks.
The solution lies in integrating authorization deeply into the retrieval pipeline, treating it as a first-class concern rather than an afterthought.
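As a minimal sketch of what treating authorization as a first-class concern can look like, the helper below re-checks every retrieved chunk right before the prompt is assembled, so a bug in an upstream filter cannot leak a document into the context. The `Chunk` type, `filter_for_prompt`, and the `is_allowed` callback are hypothetical names for illustration, not part of any library; in this application the callback would presumably wrap an FGA check.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List


@dataclass(frozen=True)
class Chunk:
    """A retrieved piece of text plus the document it came from (hypothetical type)."""
    document_id: str
    text: str


def filter_for_prompt(
    user_id: str,
    chunks: Iterable[Chunk],
    is_allowed: Callable[[str, str], bool],
) -> List[Chunk]:
    """Keep only chunks whose source document the user may view.

    Fails closed: if the permission check raises, the chunk is dropped.
    """
    decisions: Dict[str, bool] = {}  # document_id -> decision, one check per document
    kept: List[Chunk] = []
    for chunk in chunks:
        if chunk.document_id not in decisions:
            try:
                decisions[chunk.document_id] = bool(is_allowed(user_id, chunk.document_id))
            except Exception:
                decisions[chunk.document_id] = False
        if decisions[chunk.document_id]:
            kept.append(chunk)
    return kept
```

Running this guard after retrieval and before prompt construction means the model never sees text that the user could not open directly, whatever the retriever returned.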
Relationship-Based Access Control with Auth0 FGA
Auth0 FGA (Fine-Grained Authorization) addresses these challenges by modeling authorization as a graph of relationships between objects. Inspired by Zanzibar, Google’s globally distributed authorization system, FGA allows for complex and dynamic permission checks without requiring extensive application code.
Defining the Authorization Model
For our example application—a paycheck insights API—employees can query their own pay history, while managers can query the pay history of their teams. The authorization rules enforce these boundaries automatically at every layer of the stack.
Here’s how the authorization model is defined:
model
  schema 1.1
type user
  relations
    define department: [department]
    define manager: manager from department
type department
  relations
    define manager: [user]
type paycheck
  relations
    define can_view: owner or manager from owner
    define owner: [user]
This model captures the organizational policy in a straightforward manner:
- A user can view a paycheck if they are the owner (the employee it belongs to).
- Alternatively, they can view it if they are a manager of the department that the owner belongs to.
Indirect Relationships
The key feature of this model is the indirect relationship:
manager from owner
This relationship automatically derives that a manager can view an employee’s paycheck without explicit checks in the application code. For instance, if Mary is set as the manager of the Developer Relations department and John is a member of that department, FGA automatically infers that Mary can view John’s paychecks.
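To make the derivation concrete, here is a toy in-memory evaluator for the two rules in this model. It is only an illustration of how the relationships compose, not how FGA is implemented, and the `department:devrel` identifier and all function names are assumptions.

```python
from typing import Set, Tuple

# Each tuple is (user, relation, object), mirroring the shape of an FGA tuple.
RelTuple = Tuple[str, str, str]


def subjects(tuples: Set[RelTuple], relation: str, obj: str) -> Set[str]:
    """All subjects directly related to `obj` via `relation`."""
    return {u for (u, r, o) in tuples if r == relation and o == obj}


def user_managers(tuples: Set[RelTuple], user: str) -> Set[str]:
    """Rule on type user: manager = manager from department."""
    managers: Set[str] = set()
    for dept in subjects(tuples, "department", user):
        managers |= subjects(tuples, "manager", dept)
    return managers


def can_view(tuples: Set[RelTuple], user: str, paycheck: str) -> bool:
    """Rule on type paycheck: can_view = owner or manager from owner."""
    owners = subjects(tuples, "owner", paycheck)
    if user in owners:
        return True
    return any(user in user_managers(tuples, owner) for owner in owners)


TUPLES: Set[RelTuple] = {
    ("user:john", "owner", "paycheck:abc123"),         # John owns the paycheck
    ("department:devrel", "department", "user:john"),  # John belongs to DevRel
    ("user:mary", "manager", "department:devrel"),     # Mary manages DevRel
}
```

With these three tuples, `can_view(TUPLES, "user:mary", "paycheck:abc123")` walks from the paycheck to its owner, from the owner to the department, and from the department to its manager. No tuple connects Mary to the paycheck directly.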
Writing Tuples
When a paycheck is uploaded, a single tuple is written to the FGA store:
{
  "user": "user:john",
  "relation": "owner",
  "object": "paycheck:abc123"
}
This tuple establishes that John owns the paycheck abc123. The authorization model then ensures that anyone with the appropriate relationship (either as an owner or a manager of John’s department) can view this paycheck.
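The manager path only resolves if the organizational tuples (department membership and department manager) are already in the store. The sketch below builds both sets of tuples and wraps them in the `writes.tuple_keys` body shape used by the OpenFGA-style Write API. The helper names and the `devrel` department ID are hypothetical, and actually sending the request (store ID, credentials) is left out.

```python
import json
from typing import Dict, List

TupleKey = Dict[str, str]


def tuple_key(user: str, relation: str, obj: str) -> TupleKey:
    """One relationship tuple in the user/relation/object shape."""
    return {"user": user, "relation": relation, "object": obj}


def org_tuples(employee_id: str, department_id: str, manager_id: str) -> List[TupleKey]:
    """Tuples written once when the org structure is set up (assumed workflow)."""
    return [
        tuple_key(f"department:{department_id}", "department", f"user:{employee_id}"),
        tuple_key(f"user:{manager_id}", "manager", f"department:{department_id}"),
    ]


def paycheck_upload_tuples(owner_id: str, paycheck_id: str) -> List[TupleKey]:
    """The single tuple written each time a paycheck is uploaded."""
    return [tuple_key(f"user:{owner_id}", "owner", f"paycheck:{paycheck_id}")]


def build_write_body(tuple_keys: List[TupleKey]) -> Dict[str, Dict[str, List[TupleKey]]]:
    """Wrap tuples in the body shape expected by a Write request."""
    return {"writes": {"tuple_keys": tuple_keys}}


if __name__ == "__main__":
    body = build_write_body(paycheck_upload_tuples("john", "abc123"))
    print(json.dumps(body, indent=2))
```

Keeping the per-upload write down to one tuple is what makes the model cheap to maintain: changing a department's manager touches one organizational tuple and no paycheck tuples.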
Structured Extraction from Messy PDFs with LlamaParse
Another significant challenge in building AI document agents is extracting meaningful information from unstructured documents like PDFs. Paychecks, for example, are often formatted with tables, different fonts, and alignments, making naive parsing strategies ineffective.
LlamaParse is a document parsing service designed specifically for RAG pipelines. It handles:
- Table Extraction: Preserves row/column relationships in tables.
- Layout-Aware Parsing: Understands multi-column and multi-section documents.
- Markdown Output: Provides clean, structured markdown that LLMs can process directly.
- Async Processing: Allows the API to remain responsive by offloading parsing tasks.
Parsing Flow
The parsing process in our application involves the following steps:
Upload the Raw PDF: The PDF file is uploaded to LlamaCloud.
file = llama.files.create(file=(filename, content, "application/pdf"))
Parse with LlamaParse: Initiates the parsing job and retrieves the structured markdown.
job = llama.parsing.parse(file_id=file.id)
# Poll until the job is complete
markdown = llama.parsing.result_markdown(job.id)
The result is clean, structured markdown that accurately represents the layout of the paycheck, including details like pay period, gross wages, deductions, and net pay.
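The "poll until the job is complete" step can be sketched as a small generic helper. This is an assumption-laden illustration: `wait_for_job`, the `get_status` callback, and the status strings `"SUCCESS"`, `"ERROR"`, and `"CANCELLED"` are hypothetical and should be mapped to whatever the parsing client actually reports.

```python
import time
from typing import Callable


def wait_for_job(
    get_status: Callable[[], str],
    *,
    interval_s: float = 2.0,
    timeout_s: float = 120.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> None:
    """Block until get_status() reports success, or raise on failure or timeout.

    The status strings are assumed values, not taken from a specific SDK.
    """
    deadline = clock() + timeout_s
    while True:
        status = get_status()
        if status == "SUCCESS":
            return
        if status in ("ERROR", "CANCELLED"):
            raise RuntimeError(f"parsing job ended with status {status}")
        if clock() >= deadline:
            raise TimeoutError("parsing job did not finish in time")
        sleep(interval_s)
```

Injecting `sleep` and `clock` keeps the helper easy to test without real waiting. In a web API, this loop would typically run in a background task so the upload request itself returns quickly.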
Integrating LlamaIndex and Auth0 FGA
Combining LlamaIndex and Auth0 FGA allows us to build a secure and efficient paycheck insights API. Here’s how the integration works:
Setting Up LlamaIndex
LlamaIndex is used to create a retriever that fetches relevant documents based on natural-language queries. The retriever interacts with the FGA service to ensure that only authorized documents are retrieved.
# Import paths for llama-index >= 0.10; older releases used "from llama_index import ..."
from llama_index.core import SimpleDirectoryReader, StorageContext, VectorStoreIndex
from llama_index.core.vector_stores import SimpleVectorStore
# Initialize an in-memory vector store
vector_store = SimpleVectorStore()
storage_context = StorageContext.from_defaults(vector_store=vector_store)
# Load documents from the local "data" directory
documents = SimpleDirectoryReader("data").load_data()
# Build the index (VectorStoreIndex replaces the deprecated GPTVectorStoreIndex alias)
index = VectorStoreIndex.from_documents(documents, storage_context=storage_context)
Implementing Authorization Checks
Before retrieving documents, the application checks the user’s permissions using Auth0 FGA.
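import requests
# Placeholder URL; a real FGA check endpoint also includes the store ID and requires an access token
FGA_CHECK_URL = "https://fga-api-url/check"
def check_access(user_id, paycheck_id):
    """Return True only if FGA reports that user_id can view paycheck_id."""
    # The Check API expects the tuple wrapped in "tuple_key"
    payload = {
        "tuple_key": {
            "user": f"user:{user_id}",
            "relation": "can_view",
            "object": f"paycheck:{paycheck_id}",
        }
    }
    response = requests.post(FGA_CHECK_URL, json=payload, timeout=10)
    response.raise_for_status()  # surface HTTP errors instead of reading a bad body
    return response.json().get("allowed", False)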
Querying the Index
Once access is verified, the application queries the index to retrieve relevant information.
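# llama-index >= 0.10 import path
from llama_index.core.vector_stores import ExactMatchFilter, MetadataFilters
def get_paycheck_insights(user_id, query):
    # Candidate paychecks for this user (helper assumed to be defined elsewhere in the application)
    paycheck_ids = get_paycheck_ids_for_user(user_id)
    # Keep only the paychecks FGA allows this user to view
    authorized_paychecks = [pid for pid in paycheck_ids if check_access(user_id, pid)]
    insights = []
    for pid in authorized_paychecks:
        # Restrict retrieval to this paycheck so unauthorized documents never reach the prompt.
        # Assumes each document was indexed with a "paycheck_id" metadata field and that
        # the vector store supports metadata filters.
        filters = MetadataFilters(filters=[ExactMatchFilter(key="paycheck_id", value=pid)])
        query_engine = index.as_query_engine(filters=filters)
        response = query_engine.query(query)
        insights.append(response.response)
    return insights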
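The list comprehension above issues one FGA check per paycheck, sequentially. As a hedged sketch, the checks can run concurrently with the standard library while still failing closed. `authorized_ids` and its `check` parameter are hypothetical names; `check` is assumed to have the same signature as `check_access`.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence


def authorized_ids(
    user_id: str,
    paycheck_ids: Sequence[str],
    check: Callable[[str, str], bool],
    max_workers: int = 8,
) -> List[str]:
    """Return the subset of paycheck_ids the user may view, preserving input order.

    Any check that raises is treated as a denial (fail closed).
    """
    def safe_check(paycheck_id: str) -> bool:
        try:
            return bool(check(user_id, paycheck_id))
        except Exception:
            return False

    if not paycheck_ids:
        return []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in the same order as the inputs
        decisions = list(pool.map(safe_check, paycheck_ids))
    return [pid for pid, ok in zip(paycheck_ids, decisions) if ok]
```

A call such as `authorized_ids(user_id, paycheck_ids, check_access)` could replace the comprehension. Threads suit this case because each check spends its time waiting on the network.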
Key Takeaways
- Authorization as a First-Class Concern: Integrating authorization deeply into the retrieval pipeline ensures that only authorized documents are accessed and synthesized by the AI model.
- Dynamic Permission Checks: Auth0 FGA’s graph-based model allows for dynamic and indirect permission checks, reducing the complexity of application code.
- Efficient Document Parsing: LlamaParse provides robust and efficient parsing capabilities, ensuring that unstructured documents are converted into a format suitable for AI processing.
Comparison Table
| Approach | Pros | Cons | Use When |
|---|---|---|---|
| Coarse-Grained Authorization | Simple to implement | Lacks document-level control | Basic applications |
| Fine-Grained Authorization with Auth0 FGA | Dynamic and flexible | More complex setup | Advanced applications with complex access requirements |
Quick Reference
- check_access(user_id, paycheck_id) - Verifies if a user has access to a specific paycheck.
- get_paycheck_insights(user_id, query) - Retrieves insights from authorized paychecks based on a natural-language query.
Timeline
Initial release of LlamaIndex 0.5
Launch of Auth0 FGA beta
Integration of LlamaIndex and Auth0 FGA in the paycheck insights API
Checklist
- Set up Auth0 FGA for relationship-based access control
- Integrate LlamaParse for document parsing
- Implement authorization checks in the retrieval pipeline
- Test the API with various user roles and scenarios
That’s it. Simple, secure, works.

