AI-Powered Legal Intelligence — Accessible, Accurate & Affordable
This research presents an AI-driven framework leveraging Small Language Models (SLMs) integrated with Retrieval-Augmented Generation (RAG) and agentic architectures to democratize legal knowledge in Sri Lanka. The system spans four specialized domains: Labour & Employment law guidance, Property & Family law advisory, Criminal case outcome prediction, and intelligent Deed document verification.
A comprehensive exploration of legal AI systems tailored for the Sri Lankan legal context, combining SLMs, RAG frameworks, and agentic workflows.
The legal system plays a critical role in maintaining justice, fairness, and social order. In Sri Lanka, legal knowledge is confined to professionals or documented in complex texts, creating significant barriers for ordinary citizens. Labour disputes, property transfers, family disputes, and criminal litigation all require specialized understanding that most citizens lack.
Recent advancements in AI, particularly Natural Language Processing (NLP), offer promising solutions. The digitization of court judgments and legal documents has created opportunities for computational analysis. Research in Legal NLP has evolved from keyword-based systems to sophisticated transformer-based models like BERT, LEGAL-BERT, and domain-specific LLMs.
However, most existing systems — LawLLM (US), LawGPT (China), Swiss-BERT variants — are jurisdiction-specific and computationally expensive, limiting their applicability to Sri Lanka's unique legal ecosystem, which blends Roman-Dutch law, English common law, and customary traditions.
LLM-based approaches demonstrate strong reasoning but suffer from jurisdictional overfitting, high computational cost, and hallucination risks. They are not directly transferable to Sri Lanka.
RAG-based systems improve factual grounding but often lack structured output generation, validation mechanisms, and user-friendly interfaces essential for non-expert users.
Small Language Models offer a compelling balance — lower computational overhead, efficient fine-tuning via LoRA/QLoRA, and strong domain adaptation capabilities when trained on curated legal datasets.
| Feature | Quick Check | LawRec | Legal Query RAG | Our System |
|---|---|---|---|---|
| Transformer Models | No | Yes (BERT) | Yes | Yes |
| RAG Integration | No | No | Yes | Yes |
| Sri Lankan Focus | No | No | No | Yes |
| Natural Language Queries | Partial | Partial | Yes | Yes |
| Structured Legal Output | No | No | Partial | Yes |
| Scalability | Medium | Medium | Medium | High |
Despite significant advances in legal AI globally, critical gaps remain for the Sri Lankan context:
Most Legal NLP research concentrates on the United States Supreme Court, European Court of Human Rights, Chinese criminal courts, and Swiss Federal Supreme Court. These systems benefit from well-digitized databases and large labeled datasets. Sri Lanka, with its hybrid Roman-Dutch and English common law system, represents a significantly underexplored jurisdiction with unique challenges: limited digitized data, multilingual content (Sinhala, Tamil, English), inconsistent document formats, and no standard benchmark datasets.
Despite the increasing need for efficient and accessible legal information systems, Sri Lanka currently lacks AI-driven legal frameworks that integrate modern NLP, transformer-based models, and Retrieval-Augmented Generation — specifically tailored for its unique legal domains.
This absence creates significant barriers to legal accessibility, reduces efficiency in legal research and decision-making, and contributes to inequality in access to legal knowledge among citizens and professionals. A survey of 40 participants (lawyers, law students, general public) revealed:
To develop specialized AI-based legal assistance systems for Sri Lankan law domains that provide reliable, context-aware, and structured legal guidance — combining fine-tuned Small Language Models with retrieval-augmented mechanisms grounded in authoritative legal sources.
All four research components follow a unified, multi-layered methodology that integrates legal data engineering, model adaptation, retrieval design, system integration, and rigorous evaluation. The Agile development framework enables iterative improvement with measurable artifacts at each stage.
Collection of legal materials from digital repositories, law books, and physical archives. OCR-based digitization of scanned documents with quality scoring. Multilingual handling (Sinhala, Tamil, English).
Cleaning, normalization, and JSONL formatting. Schema validation ensuring consistent instruction-context-output structure. Train/validation/test splitting with leakage prevention.
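The validation and splitting steps can be sketched roughly as follows. This is a minimal illustration, not the project's actual pipeline: the field names `instruction`, `context`, and `output` are assumed from the structure described above, `source_id` is a hypothetical grouping field, and the 80/10/10 split ratio is an arbitrary choice. Leakage is avoided by hashing the source document identifier, so every record derived from the same document lands in the same split.

```python
import hashlib
import json

# Assumed field names, following the instruction-context-output structure.
REQUIRED_FIELDS = ("instruction", "context", "output")


def validate_record(line: str) -> dict:
    """Parse one JSONL line and require non-empty string values for each field."""
    record = json.loads(line)
    if not isinstance(record, dict):
        raise ValueError("each JSONL line must be a JSON object")
    for name in REQUIRED_FIELDS:
        value = record.get(name)
        if not isinstance(value, str) or not value.strip():
            raise ValueError(f"missing or empty field: {name}")
    return record


def assign_split(group_key: str, val_pct: int = 10, test_pct: int = 10) -> str:
    """Map a group key to a split deterministically, so a group never straddles splits."""
    bucket = int(hashlib.sha256(group_key.encode("utf-8")).hexdigest(), 16) % 100
    if bucket < test_pct:
        return "test"
    if bucket < test_pct + val_pct:
        return "validation"
    return "train"


def split_records(lines, group_field: str = "source_id") -> dict:
    """Validate every line, then route it to a split by its source document."""
    splits = {"train": [], "validation": [], "test": []}
    for line in lines:
        record = validate_record(line)
        # "source_id" is a hypothetical field; fall back to the context text.
        key = str(record.get(group_field) or record["context"])
        splits[assign_split(key)].append(record)
    return splits
```

Because the split is a pure function of the group key, re-running the pipeline after adding new records never moves an existing document from one split to another.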
LoRA/QLoRA-based fine-tuning using Unsloth. Domain adaptation of Qwen3-8B (legal recommendation) and LEGAL-BERT-SMALL (criminal prediction). Structured output alignment training.
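For readers unfamiliar with LoRA, the low-rank update it trains can be written as below. The symbols and the example dimensions are illustrative only and are not taken from this project's training configuration.

```latex
W = W_0 + \frac{\alpha}{r}\, B A,
\qquad B \in \mathbb{R}^{d \times r},\quad A \in \mathbb{R}^{r \times k},\quad r \ll \min(d, k)

\text{trainable parameters per adapted matrix: } r\,(d + k) \quad \text{instead of} \quad d\,k
```

Here $W_0$ is the frozen pretrained weight matrix and only $A$ and $B$ are updated. As an illustration, with $d = k = 4096$ and $r = 16$, the adapter has $16 \times 8192 = 131{,}072$ trainable parameters against $16{,}777{,}216$ in the full matrix, or about 0.78%. QLoRA trains the same kind of adapters on top of a base model whose frozen weights are stored in 4-bit quantized form, which further reduces memory use.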
FAISS index construction from legal document embeddings. Document-diverse reranking. Agentic RAG with LangGraph orchestration: classify → retrieve → grade → generate → validate.
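The control flow of the five stages (classify, retrieve, grade, generate, validate) and the idea of document-diverse reranking can be sketched in plain Python. This is a hedged stand-in, not the project's implementation: it uses neither FAISS nor LangGraph, the `Passage` and `State` types are hypothetical, the keyword classifier and templated answer are placeholders for model calls, and the `min_score` threshold and per-document cap are assumed parameters.

```python
from dataclasses import dataclass, field


@dataclass
class Passage:
    doc_id: str   # identifier of the source document
    text: str
    score: float  # similarity from the vector index (higher is better)


@dataclass
class State:
    query: str
    domain: str = ""
    passages: list = field(default_factory=list)
    answer: str = ""
    valid: bool = False


def diverse_rerank(passages, top_k=3, max_per_doc=1):
    """Keep the best-scoring passages while capping how many come from one document."""
    picked, per_doc = [], {}
    for p in sorted(passages, key=lambda p: p.score, reverse=True):
        if per_doc.get(p.doc_id, 0) < max_per_doc:
            picked.append(p)
            per_doc[p.doc_id] = per_doc.get(p.doc_id, 0) + 1
        if len(picked) == top_k:
            break
    return picked


def run_pipeline(query, candidates, min_score=0.5):
    state = State(query=query)
    # classify: placeholder keyword rule standing in for a trained classifier
    state.domain = "property" if "land" in query.lower() else "family"
    # retrieve: candidates stand in for vector-index hits
    retrieved = diverse_rerank(candidates, top_k=3)
    # grade: discard passages whose relevance score is too low
    state.passages = [p for p in retrieved if p.score >= min_score]
    # generate: templated string standing in for the language model call
    if state.passages:
        cited = ", ".join(p.doc_id for p in state.passages)
        state.answer = f"[{state.domain}] Guidance based on: {cited}"
    # validate: the answer must cite every passage it was grounded on
    state.valid = bool(state.passages) and all(
        p.doc_id in state.answer for p in state.passages
    )
    return state
```

Capping passages per document keeps one long statute or judgment from filling the whole context window, which is the usual motivation for a diversity-aware reranker. In a graph-based orchestrator, each commented stage would typically become its own node, with a failed validation routed back to retrieval.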
FastAPI backend with modular microservices. React frontend. Multi-layer evaluation (model-level, retrieval-level, system-level). End-to-end testing and iterative refinement.
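For the model-level layer, a classifier such as the outcome predictor is commonly scored with accuracy and macro-averaged F1. The sketch below shows how those two numbers are computed; it is a generic illustration with made-up labels, not the project's evaluation code, and the function names are hypothetical.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1, so rare classes count as much as common ones."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)
```

Macro F1 is usually reported alongside accuracy for imbalanced multi-class problems, because a model can reach high accuracy by favouring the majority class while performing poorly on rare outcomes.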
The research employs a carefully selected technology stack balancing capability, efficiency, and deployability.
SLM + RAG system accepting natural language queries, outputting structured legal recommendations with applicable Act, Section, Year, and analogous case scenarios.
IT22322326 - E. Niruththika
LEGAL-BERT-SMALL fine-tuned on 890 Sri Lankan criminal judgments (2021–2025) for multi-class outcome classification: convicted, acquitted, sentence reduced, etc.
IT22049322 - Abiramy.T
Agentic RAG system providing step-by-step legal guidance for Property Law and Family Law, built on a fine-tuned Qwen3-1.7B with 4,700+ structured JSONL entries.
IT22177032 - E.S. Mathusigan
Multi-agent template matching for 5 deed types (Sale, Gift, Mortgage, Power of Attorney, Testamentary). 99.13% classification accuracy with rule-based legal validation.
IT22030412 - A. Thuvaraga
Track the progression of our research through key assessment milestones and deliverables.
Initial research proposal outlining problem statement, objectives, and planned approach
The project proposal established the foundational research framework for all four components. It defined the research problem — the lack of AI-driven legal systems tailored for Sri Lanka — and proposed an integrated approach combining Small Language Models with RAG architectures.
First progress evaluation demonstrating initial implementation and data preparation
The first progress presentation demonstrated the data pipeline, initial model experiments, and early system prototypes for all four research components.
Second evaluation showing system integration, testing results, and refined models
Demonstrated functional prototypes with integrated RAG pipelines, agent-based workflows, and initial evaluation metrics across all components.
Complete system evaluation, final report submission, and comprehensive demonstration
Final submission of all four research components with complete documentation, evaluation reports, and fully deployed web applications.
Oral defense and examination of the research work by panel
The research viva will involve a comprehensive oral examination by an academic panel evaluating the depth, validity, and significance of all four research components.
All research documents produced throughout the project lifecycle.
Formal project initiation document outlining scope, stakeholders, objectives, and governance structure for all four research components.
Comprehensive research proposal covering literature review, problem statement, research objectives, methodology, and feasibility analysis.
E. Niruththika's individual project proposal report for the Labour & Employment Law recommendation system.
Abiramy.T's individual project proposal report for the criminal case outcome prediction system.
E.S. Mathusigan's individual project proposal report for the property and family law guidance system.
A. Thuvaraga's individual project proposal report for the deed document verification agent.
E. Niruththika's final research report on the Labour and Employment Law Recommendation System using Qwen3-8B + RAG.
Abiramy.T's final research report on criminal judicial outcome prediction using LEGAL-BERT-SMALL on Sri Lankan High Court judgments.
E.S. Mathusigan's report on step-by-step legal guidance for Property and Family Law using Agentic RAG with Qwen3-1.7B.
A. Thuvaraga's report on the multi-agent deed template matching system achieving 99.13% classification accuracy.
Assessment checklists and progress tracking documents for all project milestones and deliverables.
Consolidated progress status document covering all four sub-projects with current development milestones and results summary.
Slide decks from all research presentations across the project lifecycle.
A dedicated research team from the Department of Information Technology, Sri Lanka Institute of Information Technology (SLIIT), working to make legal knowledge accessible to all Sri Lankans.
Supervisor
Department of Information Technology
Sri Lanka Institute of Information Technology
Co-Supervisor
Department of Information Technology
Sri Lanka Institute of Information Technology
B.Sc. (Hons) Information Technology
Research Focus: Labour & Employment Law Recommendation System — Fine-tuned Qwen3-8B with FAISS-based RAG for structured legal recommendations including Act, Section, and Year identification.
✉ it22322326@my.sliit.lk
B.Sc. (Hons) Information Technology
Research Focus: Criminal Case Outcome Prediction — LEGAL-BERT-SMALL fine-tuned on 890 Sri Lankan criminal judgments for 11-class judicial outcome classification (67% accuracy, 0.61 Macro F1).
✉ it22049322@my.sliit.lk
B.Sc. (Hons) Information Technology
Research Focus: Property & Family Law Step-by-Step Guidance — Qwen3-1.7B with Agentic RAG (LangGraph), 4,700+ JSONL training samples, three-backend comparative evaluation (SLM / RAG / Agentic RAG).
✉ it22177032@my.sliit.lk
B.Sc. (Hons) Information Technology
Research Focus: Deed Document Template Matching Agent — Multi-agent SLM system for 5 deed types (Sale, Gift, Mortgage, Power of Attorney, Testamentary). 99.13% classification accuracy with rule-based legal validation.
✉ it22030412@my.sliit.lk
For research enquiries, collaboration opportunities, or questions about our legal AI systems, please reach out through any of the following channels.
Sri Lanka Institute of Information Technology (SLIIT)
Department of Information Technology
cdap.sliit.lk
Dr. Prasanna Sumathipala — SLIIT
2025 / 2026 — Final Year Research Project