Cybersecurity

Securing GenAI models with a rigorous cybersecurity defense


De-risking the AI adoption journey

Firms need to urgently address the risks arising from the adoption of GenAI models.

These risks can be new and, in some cases, unique. The key considerations that may help the enterprise to develop a robust approach to prevent these risks include the following:

  • Ensure model robustness, data integrity, and data protection

The cybersecurity team must focus on improving data integrity by ensuring that the data used to train GenAI models is free from tampering and represents a diverse and accurate dataset. This prevents or reduces the impact of data poisoning attacks.

Further, there should be security architecture reviews and threat modeling to assess model design for robustness and inherent resistance to attacks. Adversarial training should be incorporated so that the model is exposed to simulated attack scenarios during its training phase.

  • Mitigate direct and indirect prompt injection

The cybersecurity team can defend against direct prompt injection attacks by ensuring that models are trained using curated datasets and reinforcement learning. Prompt instruction and formatting techniques can guide model responses to potential threats and can also be used to identify adversarial inputs when combined with detection methods.

Indirect prompt injection can be defended against by using reinforcement learning from human feedback (RLHF) and filtering retrieved inputs to remove harmful instructions. Other techniques that can be leveraged are an LLM moderator to detect sophisticated attacks and interpretability-based solutions for outlier detection.

  • Mitigating API and software supply chain risks

The cybersecurity team needs to manage potential risks across the software supply chain through assurance practices like regular vulnerability scanning of model artifacts. Adopting safe model persistence formats is important and verifying cryptographic hashes in downloads maintains training data integrity and protects against manipulation.

  • Access control and observability

Strict access control measures are necessary to manage who can interact with AI systems. Observability needs to be built in as well, to allow AI models to be continuously monitored and tested in the production environment. Red teaming and vulnerability assessments can provide ongoing data inputs to feed back into AI security processes. Every LLM will change over time as it learns from newer datasets to keep outputs relevant and current, and this heightens the need for ongoing monitoring.

  • Transparency and explainability

New AI models cannot be black boxes. Enterprises must be able to explain a model’s decisions and outputs. Such transparency can also help identify when an attack alters model behavior.

LLM outcomes should be validated through human intervention to reduce the probability of hallucination and increase their accuracy. Extending enterprise security controls and security processes can ensure adversarial attack detection and response, while also helping to regulate and track the use of LLM solutions. 



Source

Related Articles

Back to top button