| Attacks | Category | Description |
|---|---|---|
| PII Leakage Focused Attacks | | |
| Autocompletion Attack | Black Box | Exploits the LLM’s completion behavior by repeatedly submitting minimal prompt prefixes and requesting the corresponding completions, potentially leading to the disclosure of PII contained in the fine-tuning data. |
| Extraction Attack | Black Box | Aims to extract sensitive information or training data embedded within LLMs by interacting directly with the models, generating queries, and receiving responses to reconstruct a dataset resembling the original training data. |
| Memorization Focused Attacks | | |
| Self-calibrated Probabilistic Variation Membership Inference Attack | Black Box | Variant of MIA that compares the probability distributions of a target model and a reference model to infer membership, utilizing a self-prompt approach to construct a reference dataset internally. |
| Neighborhood Attack | Black Box | Variant of MIA that generates augmented neighbor samples for a target text using a Masked Language Model (MLM) and compares the loss scores of the target text and its neighbors to infer membership. |
| LiRA-Candidate | Black Box | Variant of MIA that compares the confidence (negative log-likelihood) of predictions made by a target model and a reference model on a given text to infer membership. |
| LiRA-Base | Black Box | Variant of MIA that compares the confidence (negative log-likelihood) of predictions made by the target model and its pre-trained base model, used as the reference, on a given text to infer membership. |
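The LiRA-style variants above all reduce to the same core signal: a sample is inferred to be a training-set member when the target model is markedly more confident on it (lower negative log-likelihood) than a reference model. A minimal sketch of that comparison, assuming toy stand-in models — the `target`/`reference` callables and the zero `threshold` are illustrative placeholders, not part of any of the attacks listed:

```python
import math

def avg_nll(model, tokens):
    """Average negative log-likelihood of `tokens` under `model`.
    `model` is any callable mapping a token to its assigned probability."""
    return -sum(math.log(model(t)) for t in tokens) / len(tokens)

def lira_score(target_model, reference_model, tokens):
    """Confidence gap between models: positive when the target assigns the
    sample a lower NLL (higher confidence) than the reference does."""
    return avg_nll(reference_model, tokens) - avg_nll(target_model, tokens)

def infer_membership(target_model, reference_model, tokens, threshold=0.0):
    """Predict membership when the confidence gap exceeds the threshold."""
    return lira_score(target_model, reference_model, tokens) > threshold

# Toy stand-ins (hypothetical): the "target" is more confident on tokens it
# has memorized than the "reference" model is.
target = lambda tok: 0.9 if tok in {"secret", "data"} else 0.5
reference = lambda tok: 0.5

print(infer_membership(target, reference, ["secret", "data", "leak"]))  # True
print(infer_membership(target, reference, ["other", "words", "here"]))  # False
```

In practice the callables would be replaced by per-token log-probabilities from the fine-tuned target LLM and a reference model, and the threshold calibrated on held-out data; the variants differ mainly in how the reference is obtained (a separately trained model, the pre-trained base model, or masked-language-model neighbors of the target text).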