Algerian Legal GraphRAG Assistant
Bilingual (FR/AR) legal AI for Algerian law. GraphRAG with hybrid retrieval: vector + BM25 + knowledge graph traversal over the full legal corpus.
System Architecture & Overview
The Algerian Legal GraphRAG Assistant was developed as the primary course project for the NLP module at ENSIA, ultimately securing the #1 rank in the academic cohort with a grade of 18.3/20.
Algerian jurisprudence is complex and highly structured, and its bilingual drafting (French and Arabic) introduces significant cross-lingual retrieval gaps. Conventional vector-only RAG pipelines struggle to follow relational links between different legal codes, amendments, and executive decrees.
To solve this, we architected a hybrid retrieval pipeline that blends semantic vector embeddings (representing local meaning) with a structured knowledge graph that preserves logical cross-references, hierarchy, and legal dependencies between documents.
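As a rough illustration of how such a hybrid pipeline can combine its signals, the sketch below fuses two ranked result lists and then widens the top hits by following cross-references. It is a minimal sketch under stated assumptions, not the project's implementation: reciprocal rank fusion is one common way to merge rankings and is assumed here, the graph is a plain adjacency dictionary standing in for the knowledge graph, and every identifier (`code_a:art_10`, `decree_b:art_2`, and so on) is hypothetical.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of chunk ids into a single ranking.

    Each list contributes 1 / (k + rank) per item, so chunks that rank
    well in more than one retriever rise to the top.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] += 1.0 / (k + rank)
    # Highest score first; ties are broken by id to keep output stable.
    return sorted(scores, key=lambda c: (-scores[c], c))


def expand_with_graph(seeds, graph, max_hops=1):
    """Append documents reachable from the seeds within max_hops edges."""
    seen = list(seeds)
    frontier = list(seeds)
    for _ in range(max_hops):
        next_frontier = []
        for node in frontier:
            for neighbor in graph.get(node, []):
                if neighbor not in seen:
                    seen.append(neighbor)
                    next_frontier.append(neighbor)
        frontier = next_frontier
    return seen


# Hypothetical toy results from a dense retriever and a lexical retriever.
vector_hits = ["code_a:art_10", "code_c:art_30", "decree_b:art_2"]
bm25_hits = ["code_a:art_10", "decree_b:art_2", "code_d:art_5"]

# Hypothetical cross-reference edges: "source cites target".
reference_graph = {
    "code_a:art_10": ["code_a:art_11"],
    "decree_b:art_2": ["law_e:art_1"],
}

fused = reciprocal_rank_fusion([vector_hits, bm25_hits])
context_ids = expand_with_graph(fused[:2], reference_graph)
print(context_ids)
```

The graph step is what a vector-only pipeline lacks: `code_a:art_11` and `law_e:art_1` are never returned by either retriever in this toy data, yet they reach the context because a top-ranked article cites them.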
Key Deliverables & Capabilities
- Bilingual Search: Dual-index semantic vector mapping for seamless Arabic and French queries.
- Hybrid Retrieval: Combined FAISS vector similarities, lexical BM25 matching, and structured NetworkX knowledge graph traversals.
- Logical Document Splitting: Custom recursive chunking aligned with article, chapter, and section boundaries rather than raw character counts.
- Entity Relation Parsing: Automatic extraction of legal references ('Article X refers to Decree Y') to continuously map new laws into the graph.
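The chunking and reference-extraction capabilities listed above can be sketched together, since the second consumes the output of the first. This is an assumption-laden illustration rather than the project's code: it handles only article-level boundaries in French text, the regular expressions and the function names `split_into_articles` and `extract_reference_edges` are hypothetical, and the sample text is invented for the example.

```python
import re

# Article header at the start of a line, e.g. "Art. 2. - " or "Article 1er. - ".
ARTICLE_HEADER = re.compile(r"^(?:Art\.|Article)\s+(\d+)(?:er)?[\s.:-]*", re.MULTILINE)

# In-text citation such as "l'article 5".
ARTICLE_REFERENCE = re.compile(r"\barticles?\s+(\d+)", re.IGNORECASE)


def split_into_articles(text):
    """Cut a legal text into one chunk per article instead of fixed-size windows."""
    headers = list(ARTICLE_HEADER.finditer(text))
    chunks = []
    for i, header in enumerate(headers):
        # An article's body runs until the next header or the end of the text.
        end = headers[i + 1].start() if i + 1 < len(headers) else len(text)
        chunks.append({
            "article": int(header.group(1)),
            "text": text[header.end():end].strip(),
        })
    return chunks


def extract_reference_edges(chunks):
    """Return (source_article, cited_article) pairs found in chunk bodies."""
    edges = []
    for chunk in chunks:
        for match in ARTICLE_REFERENCE.finditer(chunk["text"]):
            target = int(match.group(1))
            if target != chunk["article"]:
                edges.append((chunk["article"], target))
    return edges


# Invented sample text, used only to exercise the functions.
sample = (
    "Article 1er. - La présente loi fixe les règles générales.\n"
    "Art. 2. - Les dispositions de l'article 1 s'appliquent aux contrats.\n"
    "Art. 3. - Sous réserve de l'article 2, le juge statue.\n"
)

chunks = split_into_articles(sample)
edges = extract_reference_edges(chunks)
print(edges)
```

Each edge can then be added to the graph as a directed "cites" relation. A real corpus would also need chapter and section levels, citations of several articles at once, references to other codes and decrees, and the Arabic equivalents, none of which this sketch attempts.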
Critical Challenge & Pivot
Structuring the bilingual Arabic-French knowledge graph was exceptionally hard because of spelling variations and complex relational syntax. We solved this by implementing a customized Arabic NLP preprocessing pipeline built on CAMeL Tools and regularized morphological parsers.
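To make the spelling-variation problem concrete, the sketch below shows the kind of orthographic normalization such a preprocessing step typically performs, so that different spellings of the same term map to one graph node. It uses only the Python standard library and is a simplified stand-in: it does not call CAMeL Tools and does not reproduce the project's pipeline, and the function name `normalize_arabic` and the chosen rules are assumptions.

```python
import re

# Short vowels and other diacritics (U+064B to U+0652) plus tatweel (U+0640).
DIACRITICS = re.compile("[\u064B-\u0652\u0640]")

# Alef with madda, hamza above, or hamza below.
ALEF_VARIANTS = re.compile("[\u0622\u0623\u0625]")


def normalize_arabic(text):
    """Collapse common Arabic spelling variants into one canonical form."""
    text = DIACRITICS.sub("", text)           # drop vowel marks and elongation
    text = ALEF_VARIANTS.sub("\u0627", text)  # unify alef forms to bare alef
    text = text.replace("\u0649", "\u064A")   # alef maqsura -> ya
    text = text.replace("\u0629", "\u0647")   # ta marbuta -> ha
    return text


# The same word, "al-madda" (the article), written without and with diacritics.
plain = "\u0627\u0644\u0645\u0627\u062F\u0629"
vocalized = "\u0627\u0644\u0652\u0645\u064E\u0627\u062F\u0651\u064E\u0629"

print(normalize_arabic(plain) == normalize_arabic(vocalized))
```

Normalization of this kind is lossy by design: it trades some precision for recall when matching entity mentions, which is why it is usually applied to index keys and lookup strings rather than to the text shown to the user.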
System Benchmarks & Outcomes
Ranked #1 in the ENSIA academic cohort with a score of 18.3/20. The system demonstrated a ~40% reduction in document retrieval latency and outperformed classic dense vector baseline models by >15% in response accuracy and answer relevance.
Engineering Stack
Chosen as the high-performance backend routing framework for its asynchronous native execution and rapid serialization.
NetworkX: Used to model, build, and run complex path-traversal algorithms across the legal relational graph.
FAISS: Deployed for fast, high-dimensional vector search to execute dense semantic retrievals on Arabic and French document chunks.
Orchestrated the modular agent logic, handling retrieval-augmented generation and abstract prompt chains seamlessly.
Served as the primary generative foundation layer, providing strong multilingual legal reasoning capabilities under strict context structures.