Artificial intelligence initiatives, and particularly large language models (LLMs), are moving from research labs into production systems at unprecedented speed. Organisations are embedding them in customer service chatbots, developer tools, content pipelines, and even critical decision-making processes. For technical teams, this shift brings both opportunity and risk.

Many vulnerabilities are already been reported, which includes but not restricted to,

Manipulated prompts
Unintended data leaks
Poisoned supply chains
And runaway compute costs

‍

Unlike traditional application flaws, these vulnerabilities are often subtle and difficult to detect without a clear framework.

That is why the OWASP Top 10 for LLM Applications (2025) matters. It provides a standard and framework for engineers, security teams, and decision-makers to evaluate risks specific to AI systems. This blog unpacks each of those risks, connects them to business impacts, and highlights how modern security practices can help manage them.

Why Security & Development Teams Need to Care

Boards and executives increasingly see artificial intelligence as a competitive advantage. But it is the engineers, developers, and security practitioners who shoulder the responsibility of making these systems safe and sustainable.

The OWASP Top 10 for LLMs surfaces risks that fall outside the boundaries of traditional application security models. Prompt injection, model poisoning, and system prompt leakage look different from SQL injection or cross-site scripting, yet their business consequences can be just as severe.

For technical professionals, ignoring these risks is not an option. Executives expect teams to continue integrating LLMs into existing workflows, without compromising productivity or resilience. Yet, a poorly configured model can expose sensitive data, increase compliance liabilities, or become a new entry point for attackers.

Through its Top 10 for LLM Applications project, OWASP provides a framework that helps translate AI intricacies into clear security priorities in a language that developers, IT, and security teams can act on.

The OWASP Top 10 for LLMs: a Quick Technical Breakdown

Let’s first define each of the 10 risks and explain how they manifest in LLM-enabled systems. Immediately after, we’ll show their business impacts and mitigation strategies in a comparison table.

LLM01: 2025 Prompt Injection

Prompt injection arises when an attacker supplies crafted text that overrides or alters an LLM’s intended instructions. Because the model processes all input as contextual guidance, hostile prompts can subvert normal behaviour, expose hidden instructions, or trigger actions beyond the system’s design.

Mechanics

This occurs when unvalidated user input or external content is passed into the model without safeguards. Once injected, the malicious text can hijack outputs or force the model to disclose information that’s supposed to be confidential.

Likely attack vectors include:

Malicious queries entered into a public-facing chatbot to override its instructions.
Hostile instructions embedded in documents retrieved by a RAG pipeline.
Poisoned content from third-party data sources that is ingested and executed as trusted context.

‍

LLM02: 2025 Sensitive Information Disclosure

Sensitive information disclosure occurs when an LLM unintentionally exposes personal data, business secrets, or proprietary model details through its outputs. This risk spans both the data fed into the model (training or runtime) and the application context in which it operates.

Mechanics

Disclosure happens when inputs aren’t adequately sanitised, when training data includes confidential material, or when adversarial prompts trick the model into bypassing safeguards. These leaks undermine privacy, compliance, and intellectual property requirements.

Likely attack vectors include:

An LLM returns another user’s personal information due to weak data sanitisation.
An attacker bypasses filters and extracts sensitive details from the model’s context.
Confidential information included in training is reproduced in outputs, exposing sensitive business or customer data.

‍

LLM03: 2025 Supply Chain Risks

Supply chain risks emerge when the components, tools, or services that support an LLM become compromised. Because LLM applications rely heavily on external dependencies—pre-trained models, third-party APIs, libraries, datasets, and plug-ins—any weakness in this chain can introduce malicious behaviour or vulnerabilities downstream.

Mechanics

These risks occur when organisations adopt external models or dependencies without sufficient validation, integrity checks, or monitoring. A compromised model, library, or plug-in can silently inject malicious code, alter outputs, or expose systems to broader compromise.

Likely attack vectors include:

A malicious third-party library used in preprocessing data introduces backdoors into the LLM pipeline.
A compromised pre-trained model from an external source propagates hidden vulnerabilities into a production system.
An attacker publishes a fake model under a trusted name, embedding malware and backdoors.

‍

LLM04: 2025 Data & Model Poisoning

Data and model poisoning occur when malicious actors tamper with datasets or fine-tuning processes to embed backdoors, biases, or harmful behaviours into an LLM. These manipulations compromise model integrity, degrading accuracy, fairness, and reliability while opening avenues for exploitation.

Mechanics

Poisoning can happen at multiple stages of the model lifecycle: pre-training, fine-tuning, or embedding. Attackers insert toxic or falsified data into the training corpus or tweak parameters so the model behaves normally under most conditions but misfires when a hidden trigger is activated.

Likely attack vectors include:

Manipulating training data so the model spreads misinformation or biased outputs.
Injecting falsified documents during training, leading the model to produce inaccurate results.
Inserting a backdoor trigger that allows authentication bypass, hidden command execution, or data exfiltration.

‍

LLM05: 2025 Improper Output Handling

Improper output handling happens when applications consume LLM responses without validation or sanitisation. Because model outputs can contain untrusted data, failing to treat them carefully may open the door to injection attacks, information leaks, or unsafe automation.

Mechanics

This risk arises when developers assume LLM outputs are inherently safe and use them directly in applications, logs, or downstream systems. Attackers exploit this trust by crafting inputs that lead to outputs containing malicious code or instructions, which are then executed without checks.

Likely attack vectors include:

An LLM generates SQL commands that are executed directly, leading to injection vulnerabilities.
A model outputs unsanitised HTML/JavaScript that is rendered in a web application, causing XSS.
Generated code suggestions with insecure functions are used in production, introducing exploitable flaws.

‍

LLM06: 2025 Excessive Agency

Excessive agency risks occur when AI agents powered by LLMs are given too much functionality, autonomy, or control over external systems without safeguards. When connected to tools, plug-ins, or APIs, these agents may perform unintended or harmful operations.

Mechanics

This vulnerability emerges when developers grant agents broad permissions or execution rights without proper restrictions. Because outputs from the model are treated as trusted instructions, a malicious or manipulated input can trigger unsafe actions that users never intended.

Likely attack vectors include:

An agent with system-level access deletes critical files after receiving a crafted prompt.
An autonomous agent executes financial transactions without user approval.
An agent controlling IoT or industrial systems carries out dangerous commands due to manipulated input.

‍

LLM07: 2025 System Prompt Leakage

System prompt leakage happens when hidden or internal instructions given to an LLM are exposed to users or attackers. These system prompts often contain sensitive configuration details, operational logic, or security controls that shape how the model behaves.

Mechanics

This vulnerability arises when models fail to sufficiently mask or protect their underlying system prompts. Attackers may extract these prompts directly through crafted queries or indirectly when the model reveals them as part of its responses. Once leaked, adversaries can reverse-engineer protections or manipulate model behaviour.

Likely attack vectors include:

A system prompt containing credentials for an integrated tool is leaked, allowing attackers to misuse those credentials.
An attacker extracts a system prompt that prohibits offensive content, links, and code execution, then bypasses those controls with a prompt injection to achieve remote code execution.
Internal developer guidance embedded in the system prompt is disclosed, enabling adversaries to manipulate safeguards.

‍

LLM08: 2025 Vector & Embedding Weaknesses

Vector and embedding weaknesses occur when the representation of data in high-dimensional vector spaces is exploited. Because embeddings are often used for semantic search, retrieval-augmented generation (RAG), or clustering, attackers can manipulate them to bypass controls or retrieve unintended information.

Mechanics

The risk arises when embedding models or vector databases fail to enforce validation or filtering. Malicious inputs crafted to appear similar in vector space can confuse systems, leading to data leakage or inappropriate associations.

Likely attack vectors include:

A resume hides malicious instructions in white-on-white text. When processed into embeddings by a RAG-based screening system, the LLM recommends an unqualified candidate.
In a multi-tenant environment, embeddings from one group are inadvertently retrieved by another group’s LLM queries, leaking sensitive business information.
Poisoned content embedded into a knowledge base causes the LLM to retrieve and act on manipulated information.

‍

LLM09: 2025 Misinformation

Misinformation occurs when an LLM generates inaccurate, fabricated, or misleading outputs that are treated as factual. While not always the result of malicious action, such “hallucinations” can still lead to serious consequences when embedded in critical workflows. Attackers may also exploit this weakness to seed false information.

Mechanics

The risk arises when outputs from LLMs are consumed without fact-checking, validation, or guardrails. Misinformation can propagate through applications, reports, or code suggestions, leading to unsafe or unreliable outcomes.

Likely attack vectors include:

A coding assistant hallucinates a package name. Attackers publish a malicious library under that name, and developers install it, leading to backdoors.
A healthcare chatbot provides unsafe medical advice, resulting in patient harm and legal liability for the provider.
An educational LLM generates fabricated citations, misleading students and researchers who rely on its outputs.

‍

LLM10: 2025 Unbounded Consumption

Unbounded consumption occurs when an LLM processes excessive or uncontrolled inputs, leading to runaway use of computational or financial resources. Because inference is costly, this weakness exposes systems to denial of service, economic drain, or even intellectual property theft through large-scale model replication.

Mechanics

The issue arises when applications allow unlimited or unvalidated queries. Attackers can flood systems with oversized inputs, repeat requests at scale, or craft resource-heavy prompts. This not only disrupts service for legitimate users but also risks unsustainable cloud costs and exposure of proprietary models.

Likely attack vectors include:

An attacker submits an oversized input that consumes excessive memory, GPU, and cloud compute, slowing or crashing the service.
Repeated high-volume API requests hog resources, denying access to legitimate users, just like a DoS attack
Attackers generate synthetic training data via the LLM’s API, fine-tuning their own model to replicate its functionality.

Business Impact & Mitigation Priorities

To make these risks easier to translate into business terms, the table below maps each OWASP category to its potential organisational impact and the mitigation priorities technical teams should focus on.

Risk ID	Business Impact	Mitigation Priority
LLM01: Prompt Injection	Data leakage Manipulated outputs Potential system compromise via untrusted instructions	Validate and sanitise inputs Constrain model context Monitor for injection attempts
LLM02: Sensitive Information Disclosure	Breach of privacy Compliance violations (e.g., NZ Privacy Act 2020, HIPC, PCI DSS) Exposure of IP or customer data	Implement strict data sanitisation Restrict model access to sensitive context Add monitoring and redaction
LLM03: Supply Chain Risks	Introduction of backdoors, malware, or vulnerabilities via compromised models, APIs, or libraries	Vet and validate third-party models/services Apply integrity checks Maintain provenance tracking
LLM04: Data & Model Poisoning	Reduced accuracy Hidden backdoors Manipulated behaviours Long-term compromise of model trust	Control training data sources Validate fine-tuning inputs Monitor for anomalous behaviour
LLM05: Improper Output Handling	Injection vulnerabilities XSS Insecure code execution Business process compromise	Treat outputs as untrusted Apply sanitisation and validation Avoid direct execution of model responses
LLM06: Excessive Agency	Financial loss System damage Unsafe automation when AI agents act without limits	Apply least-privilege to agents Restrict tool/API access Enforce human-in-the-loop for critical actions
LLM07: System Prompt Leakage	Credential exposure Bypass of safety controls Reverse-engineering of protections	Mask or encrypt system prompts Separate sensitive data from prompts Detect and block extraction attempts
LLM08: Vector & Embedding Weaknesses	Data leakage across tenants Manipulated retrievals Poisoned knowledge bases	Validate embeddings Isolate tenants in vector databases Monitor for poisoned or adversarial entries
LLM09: Misinformation	Unsafe outputs Reputational damage Legal liability from false advice or fabricated content	Apply fact-checking pipelines Constrain LLM use in high-risk domains Require human review for critical outputs
LLM10: Unbounded Consumption	Denial of service Runaway compute costs Model replication leading to IP theft	Enforce rate limits and quotas Validate input sizes Monitor for large-scale extraction attempts

‍

How Blacklock Helps You Secure LLM-Powered Applications

The OWASP Top 10 for LLMs shows that these risks span training, runtime, and ongoing operations. Hence, AI system security initiatives require continuous validation, not one-off testing. Blacklock provides the automation, coverage, and developer-focused outputs needed to operationalise these practices.

To support technical teams, Blacklock delivers several capabilities that directly align with OWASP’s risk framework, such as:

Continuous penetration testing

Blacklock’s PTaaS platform combines automated scanning with manual testing across DAST, SAST, API, infrastructure, and SBOM coverage. This allows teams to detect risks like injection flaws (LLM01, LLM05) and supply chain issues (LLM03) as part of normal release cycles.

Automated Security Validation

Developers can instantly retest fixes using AI-driven validation agents. This shortens remediation cycles for risks such as sensitive information disclosures (LLM02) and poisoned data (LLM04), ensuring the vulnerabilities are actually resolved before production.

Vulnerability Kill Chain Analysis

Findings are mapped to the kill chain and ordered into prioritised remediation plans. This helps teams address the most critical exposures first, from excessive agent permissions (LLM06) to unbounded consumption (LLM10).

Developer workflow integration

Blacklock plugs directly into GitHub, GitLab, Jira, and Azure DevOps, embedding security into CI/CD pipelines. Vulnerabilities are triaged and tracked within existing workflows, reducing overhead and aligning with DevSecOps practices.

Would you like to explore how Blacklock can help your organisation put these practices into action? Contact us

Alternatively, you might prefer to get a 14-day free trial instead.

Share this post