RetentiveGuard

RetentiveGuard is a machine learning model that classifies whether an essay was written by a student or generated by a Large Language Model (LLM). The project uses natural language processing (NLP) and deep learning techniques to distinguish human-written essays from LLM-generated ones, with the goal of supporting the authenticity and integrity of academic submissions.




Objectives

The primary goal of RetentiveGuard is to enhance academic integrity by providing educators and institutions with a tool to verify the originality of student essays. The system aims to:

Dataset

The project uses a unique dataset composed of essays written in response to one of seven essay prompts:

Methodology

The project follows a structured approach to ensure accurate and reliable results:

  1. Data Acquisition: Collect essays written by students and those generated by LLMs.
  2. Data Preprocessing: Clean and normalize the data so that it is suitable for model training.
  3. Model Configuration: Set up the machine learning model with appropriate parameters.
  4. Model Training: Train the model using the prepared training set.
  5. Model Inference: Apply the trained model to the test set to make predictions.
  6. Post-Inference Processing: Analyze the results and refine the model as necessary.
  7. Output: Generate a final report on model performance and accuracy.
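
The steps above can be illustrated with a small, self-contained sketch. It is not the RetentiveGuard model: the classifier below is a multinomial Naive Bayes stand-in written with the Python standard library, chosen only to make the preprocess, train, and infer stages concrete. The class name NaiveBayesDetector, the label convention (0 = student-written, 1 = LLM-generated), and the toy essays are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def preprocess(text):
    """Preprocessing step: lowercase and split into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayesDetector:
    """Hypothetical stand-in classifier (multinomial Naive Bayes, add-one smoothing).

    Assumed labels: 0 = student-written, 1 = LLM-generated.
    Both labels must appear in the training data.
    """

    def fit(self, essays, labels):
        # Training step: count words per class and documents per class.
        self.word_counts = {0: Counter(), 1: Counter()}
        self.doc_counts = Counter(labels)
        for essay, label in zip(essays, labels):
            self.word_counts[label].update(preprocess(essay))
        self.vocab = set(self.word_counts[0]) | set(self.word_counts[1])
        return self

    def _log_score(self, tokens, label):
        total_docs = sum(self.doc_counts.values())
        total_words = sum(self.word_counts[label].values())
        score = math.log(self.doc_counts[label] / total_docs)
        for tok in tokens:
            if tok in self.vocab:  # words never seen in training are skipped
                count = self.word_counts[label][tok]
                score += math.log((count + 1) / (total_words + len(self.vocab)))
        return score

    def predict_proba(self, essay):
        """Inference step: probability that the essay is LLM-generated."""
        tokens = preprocess(essay)
        s0, s1 = self._log_score(tokens, 0), self._log_score(tokens, 1)
        m = max(s0, s1)  # subtract the max for numerical stability
        e0, e1 = math.exp(s0 - m), math.exp(s1 - m)
        return e1 / (e0 + e1)

    def predict(self, essay, threshold=0.5):
        """Post-inference step: turn the probability into a 0/1 label."""
        return int(self.predict_proba(essay) >= threshold)


# Toy data, invented for this example only.
train_essays = [
    "i think cars are bad becuase my town has alot of traffic",
    "my freind and i walked to school and it was fun",
    "in conclusion, limiting car usage offers numerous benefits for society",
    "furthermore, it is important to note that this approach offers numerous advantages",
]
train_labels = [0, 0, 1, 1]

detector = NaiveBayesDetector().fit(train_essays, train_labels)
llm_like = detector.predict("it is important to note the numerous benefits")
student_like = detector.predict("my freind and i walked to town")
print(llm_like, student_like)
```

A real run would replace the toy lists with the training and test sets and the stand-in classifier with the project's own model; the stage boundaries stay the same.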

How to Use RetentiveGuard

Follow these steps to set up and use the RetentiveGuard model:

  1. Clone the Repository:
     git clone https://github.com/yash-raj202134/RetentiveGuard.git
    
  2. Install Dependencies:
    • Use Conda to create and activate an environment:
        conda create -n retentiveguard python=3.10 -y
        conda activate retentiveguard
      
    • Install project requirements:
        pip install -r requirements.txt
      
  3. Prepare Data:
    • Ensure the training and test sets are available in the specified directory.
    • The directory structure and file format should follow the guidelines provided in the project documentation.
  4. Run the Model:
    • Use the provided scripts to preprocess data, train the model, and run inference:
        python preprocess_data.py
        python train_model.py
        python run_inference.py
      
  5. Evaluate Results:
    • Review the output and analyze the model’s performance.
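
As a rough aid for the evaluation step, the sketch below computes common binary classification metrics from true labels and predicted labels. It assumes the predictions are available as plain Python lists with 1 meaning LLM-generated and 0 meaning student-written; the output format of run_inference.py is not documented here, so the function name evaluate and the example lists are hypothetical.

```python
def evaluate(y_true, y_pred):
    """Return accuracy, precision, recall, and F1 for binary 0/1 labels.

    Assumed convention: 1 = LLM-generated (positive class), 0 = student-written.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Made-up labels and predictions, for illustration only.
y_true = [1, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 1, 1, 0]
metrics = evaluate(y_true, y_pred)
print(metrics)
```

For this use case, precision matters in particular: a false positive means a student's own essay is flagged as LLM-generated, so accuracy alone can hide the costliest kind of error.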

Future Directions

The development of RetentiveGuard is an ongoing process, with future enhancements aimed at:

Technologies Used

Conclusion

RetentiveGuard addresses the authenticity of written content at a time when AI-generated text is increasingly prevalent. By combining NLP techniques with a structured training and evaluation process, it aims to give educators and institutions a practical way to distinguish human-written essays from AI-generated ones.

Contributing

We welcome contributions from the community! If you would like to contribute, please follow these steps:

  1. Fork the repository.
  2. Create a new branch with your changes.
  3. Open a pull request against the main branch.

For major changes, please open an issue first to discuss what you would like to change.

License

This project is licensed under the MIT License. See the LICENSE file for more details.

Contact

Feel free to reach out with any questions or feedback:

Acknowledgments


Happy Coding!