mirror of
https://github.com/Hxnxe/CyberSentinel-AI.git
synced 2025-11-04 17:13:53 +00:00
# 🔥 CyberSentinel AI - Automated Security Monitoring and AI Analysis System

CyberSentinel AI is an **automated security monitoring and AI analysis system** that **tracks the latest security vulnerabilities (CVEs)** and **security-related repositories on GitHub** in real time. It uses **Artificial Intelligence (AI)** for in-depth analysis and **automatically publishes valuable security intelligence to a blog platform**.

## 🚀 Key Features

* **Multi-Source Data Monitoring**:
    * **CVE Monitoring**: Scrapes the latest CVE-related information from GitHub in real time, so new vulnerability trends can be discovered and tracked quickly.
    * **GitHub Repository Monitoring**: Monitors security-related open-source projects on GitHub through keyword searches and predefined watch lists.
* **Intelligent AI Analysis**:
    * **OpenAI & Gemini Dual Engine**: Integrates OpenAI and Gemini models for in-depth analysis of security data.
    * **Multi-Dimensional Security Assessment**: Evaluates CVEs and repositories along several dimensions, including **vulnerability principles**, **exploitation methods**, **risk levels**, and **impact scope**.
    * **Value Judgment and Filtering**: Uses AI to judge the value of each piece of security information, filtering out low-value items and focusing on noteworthy security threats and technologies.
* **Automated Workflow**:
    * **Fully Automated Monitoring**: Runs unattended **24/7**, automating security information collection, analysis, and report generation.
    * **Daily Security Briefing**: Generates a daily security briefing on a schedule, summarizing the latest CVEs and GitHub security repository activity, and pushes it to a blog platform.
    * **Dynamic Blacklisting**: Updates blacklists automatically based on AI analysis results, reducing noise from invalid sources and improving monitoring efficiency.
* **Flexible Configuration and Management**:
    * **Multi-GitHub Token Support**: Accepts multiple GitHub tokens and rotates between them to avoid API rate limits.
    * **Configurable Monitoring Parameters**: Keywords, watched repository lists, blacklists, and other settings can be adjusted through the configuration file.
    * **Detailed Logging**: Logs every critical step of system operation to aid troubleshooting and monitoring.
* **Automated Blog Publishing**:
    * **Integrated Blog Platform**: Uses a blog platform API to publish daily security briefings automatically.
    * **Markdown Reports**: Analysis results and security briefings are generated in **Markdown format**, making them easy to read and edit.

## 🛠️ Technical Implementation

### 1. Monitoring Modules (Monitors)

* **`cve_monitor.py`**: **CVE Monitor**
    * **GitHub API Interaction**: Searches the GitHub API for CVE-related repositories using the keyword `CVE-202+` and sorts results by `updated` time.
    * **CVE Information Extraction**: Extracts CVE numbers from repository names and descriptions using regular expressions.
    * **Repository Information Crawling**: Retrieves repository descriptions, star counts, update times, recent commits, and other information.
    * **Blacklist Filtering**: Supports **user blacklists** and **repository blacklists** to filter out invalid information sources.
    * **File Content Analysis**: Clones repositories locally, analyzes **README.md** and other **high-priority files**, calculates file **relevance scores**, and performs an initial filter for high-value repositories.
    * **Intelligent Token Management**: Rotates GitHub tokens automatically and checks their status, switching to an available token so that monitoring tasks are not interrupted.
    * **Database Storage**: Stores CVE records in the **SQLite** database **`database/cve_record.db`**, including CVE numbers, descriptions, publication dates, last modified dates, repository URLs, and other information.
* **`github_monitor.py`**: **GitHub Repository Monitor**
    * **Keyword Search**: Periodically searches GitHub repositories based on the `GITHUB_KEYWORDS` list defined in the configuration file **`config.py`**.
    * **Watch List**: Uses the `WATCHED_REPOSITORIES` list in **`config.py`** to **focus monitoring** on predefined security repositories.
    * **Repository Information Crawling**: Retrieves detailed repository information, including descriptions, star counts, last update times, and recent commit records.
    * **Commit Record Analysis**: Crawls the **recent commit records** of repositories, analyzes commit messages and file changes, and makes an initial judgment of each repository's **security relevance**.
    * **Blacklist Filtering**: Supports **user blacklists** and **repository blacklists** to filter out invalid information sources.
    * **Intelligent Token Management**: Shares the token management mechanism with the CVE monitor.
    * **Database Storage**: Stores GitHub repository records in the **SQLite** database **`database/github_repo.db`**, including repository names, URLs, descriptions, last update times, star counts, and whether each repository is considered high-value.

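The CVE extraction and blacklist filtering steps described above can be sketched as follows. This is a minimal illustration, not the project's actual code: the function names `extract_cve_ids` and `is_blacklisted` are hypothetical, and the pattern assumes the standard CVE identifier format (`CVE-`, a four-digit year, then four or more digits).

```python
import re

# Standard CVE identifier: "CVE-", a 4-digit year, then 4 or more digits.
CVE_PATTERN = re.compile(r"CVE-\d{4}-\d{4,}", re.IGNORECASE)


def extract_cve_ids(*texts):
    """Return unique CVE IDs found in the given strings, upper-cased, in first-seen order."""
    seen = []
    for text in texts:
        # `text or ""` tolerates a missing description (None).
        for match in CVE_PATTERN.findall(text or ""):
            cve_id = match.upper()
            if cve_id not in seen:
                seen.append(cve_id)
    return seen


def is_blacklisted(full_name, blacklist_users, blacklist_repos):
    """Check an "owner/repo" name against the user and repository blacklists."""
    owner = full_name.split("/", 1)[0]
    return owner in blacklist_users or full_name in blacklist_repos
```

A monitor could call `extract_cve_ids(repo_name, repo_description)` on each search result and skip any repository for which `is_blacklisted(...)` returns `True` before doing more expensive work such as cloning.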
### 2. AI Analysis Module (AI)

* **`analyzer.py`**: **AI Analyzer**
    * **OpenAI & Gemini API**: Integrates the **OpenAI API** (primary) and the **Gemini API** (backup), and supports **multi-model** switching, such as `gpt-4o-mini-2024-07-18` (fallback model).
    * **Prompt Engineering**: Uses **different prompt templates** for **different analysis scenarios** (CVE analysis, new repository analysis, repository update analysis, specific watched repository analysis) to improve analysis results.
    * **JSON Format Output**: Requires the AI to **output analysis results strictly in JSON format** so they are easy to parse and process.
    * **Multi-Dimensional Security Analysis**: AI analysis results include **brief descriptions of vulnerabilities/repositories**, **detailed summaries**, **risk levels**, **key points**, **technical details**, **affected components**, **value assessments**, **security types**, **update types**, and **vulnerability exploitation status**.
    * **Result Validation and Standardization**: Performs **strict format validation** and **content standardization** on the JSON returned by the AI so the data is accurate and usable.
    * **Dynamic Blacklist Update**: Based on AI analysis results, **automatically decides** whether repositories or users should be added to the blacklist and **updates the blacklist file**.
    * **Analysis Result Persistence**: **Saves** AI analysis results as **JSON files** and **updates** the corresponding records in the **database**.
    * **Article Title Classification**: Supports **AI classification** of security article titles for generating security briefing reports.
    * **API Failover**: When an OpenAI API call fails, **automatically switches** to a **backup OpenAI API** or the **Gemini API** to improve **stability and availability**.

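The failover and result validation behavior described above can be sketched as follows. This is a hedged illustration under stated assumptions: each provider is modeled as a plain callable that takes a prompt and returns the raw reply text, and the field names in `REQUIRED_KEYS` are hypothetical. The real `analyzer.py` may structure its clients and its JSON schema differently.

```python
import json

# Hypothetical required fields; the project's real schema may differ.
REQUIRED_KEYS = {"summary", "risk_level"}


def parse_result(raw):
    """Parse a model reply as JSON and check that the required keys are present."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    missing = REQUIRED_KEYS - data.keys()
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    # Standardize the risk level so downstream code sees one spelling.
    data["risk_level"] = str(data["risk_level"]).strip().upper()
    return data


def analyze_with_failover(prompt, providers):
    """Try each (name, call) provider in order; return the first valid result."""
    errors = []
    for name, call in providers:
        try:
            return name, parse_result(call(prompt))
        except Exception as exc:  # network error, invalid JSON, missing keys, ...
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))
```

Treating an unparseable reply the same way as a failed call means a provider that answers with malformed JSON also triggers the switch to the next provider, which is one simple way to combine the validation and failover steps.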
### 3. Data Processing and Management (Utils)

* **`logger.py`**: **Logger**
    * Uses the `logging` module to provide **complete logging**, covering **DEBUG**, **INFO**, **WARNING**, **ERROR**, and other levels.
    * Log entries are **detailed** and **structured**, making it easy to troubleshoot and monitor the system.
    * Logs are written to the file **`logs/security_monitor.log`** and rolled over daily.
* **`csv_writer.py`**: **CSV Result Writer** (currently unused; available for extension)
    * **Exports analysis results to CSV files** for data analysis and sharing.
* **`article_fetcher.py`**: **Article Fetcher**
    * **Multi-Source Fetching**: Currently fetches security articles from the **BruceFeIix** and **D洞见 (doonsec)** WeChat official accounts.
    * **Article Title and URL Extraction**: Uses regular expressions to extract article titles and URLs from web page content.
    * **Retry Mechanism**: Retries failed requests with a **backoff strategy** to improve the **stability and success rate** of article fetching.
    * **Article Title Cleaning**: **Standardizes** and **cleans** article titles, removing redundant markers and formatting.
* **`article_manager.py`**: **Article Manager**
    * **Article De-duplication**: **Automatically filters out** article URLs that have already been processed to avoid duplicate analysis and publishing.
    * **AI Classification Result Processing**: Processes AI article title classification results and **organizes article lists by category**.
    * **Daily Security Briefing Report Generation**: Generates daily security briefing reports in **Markdown format** on a schedule, summarizing the latest security articles and AI analysis results.
    * **Automated Blog Publishing**: Calls the `blog_manager.py` module to **publish daily security briefing reports to a blog platform automatically**.
    * **Article Data Persistence**: **Saves** processed URLs and classified articles as **JSON files** for later use and management.
* **`blog_manager.py`**: **Blog Manager**
    * **Blog Platform API Interaction**: Encapsulates **common functions** for interacting with the blog platform API, such as **creating articles** and **updating articles**.
    * **Article ID Mapping Management**: **Records** the **article IDs** of daily security briefing reports on the blog platform so they can be updated and managed later.
    * **Automated Blog Publishing**: **Publishes daily security briefing reports to a blog platform automatically**.

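The daily log rollover described for `logger.py` can be sketched with the standard library's `TimedRotatingFileHandler`. This is a minimal sketch, not the project's actual logger: the function name `build_logger`, the logger name, the format string, and the seven-file retention are all assumptions; only the log path and the daily rollover come from the description above.

```python
import logging
from logging.handlers import TimedRotatingFileHandler
from pathlib import Path


def build_logger(log_path="logs/security_monitor.log", backup_count=7):
    """Return a logger that writes to log_path and rolls the file over at midnight."""
    path = Path(log_path)
    path.parent.mkdir(parents=True, exist_ok=True)  # create logs/ if needed

    logger = logging.getLogger("security_monitor")  # assumed logger name
    logger.setLevel(logging.DEBUG)
    if not logger.handlers:  # avoid adding duplicate handlers on repeated calls
        handler = TimedRotatingFileHandler(
            str(path), when="midnight", backupCount=backup_count, encoding="utf-8"
        )
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s [%(name)s] %(message)s")
        )
        logger.addHandler(handler)
    return logger
```

With `when="midnight"` the handler starts a new file each day and keeps at most `backup_count` older files, so the log directory does not grow without bound.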
### 4. Database (Database)

* **`database/models.py`**: **Database Model Definition**
    * Uses **SQLAlchemy** to define two data models, **`CVERecord`** (CVE record) and **`Repository`** (GitHub repository record), for data storage and querying.
    * The databases are **SQLite** files located at **`database/cve_record.db`** and **`database/github_repo.db`**.

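To illustrate the kind of record the CVE database holds, here is a dependency-free sketch that uses the standard library's `sqlite3` module in place of SQLAlchemy. It is not the project's `CVERecord` model: the table name, column names, and helper functions are assumptions derived from the fields listed earlier (CVE number, description, publication date, last modified date, repository URL).

```python
import sqlite3

# Assumed schema; the real CVERecord model may use different names and types.
SCHEMA = """
CREATE TABLE IF NOT EXISTS cve_record (
    cve_id        TEXT PRIMARY KEY,
    description   TEXT,
    published     TEXT,
    last_modified TEXT,
    repo_url      TEXT
)
"""


def open_db(path="database/cve_record.db"):
    """Open (or create) the database file and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def save_cve(conn, record):
    """Insert a record, replacing any existing row with the same cve_id."""
    conn.execute(
        "INSERT OR REPLACE INTO cve_record "
        "(cve_id, description, published, last_modified, repo_url) "
        "VALUES (:cve_id, :description, :published, :last_modified, :repo_url)",
        record,
    )
    conn.commit()
```

Using the CVE number as the primary key means that seeing the same CVE again updates the existing row instead of creating a duplicate. Passing `":memory:"` to `open_db` gives a throwaway database that is convenient for experiments.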
### 5. Configuration File (Config)

* **`config.py`**: **System Configuration File**
    * Centrally manages the system's **configuration parameters**, such as database paths, API keys, monitoring intervals, keyword lists, and blacklists.
    * Lets users **customize** and **adjust** system behavior.
    * Includes the following main configuration items:
        * `DATABASE_PATH`: Database file path
        * `MONITOR_INTERVAL`: Monitoring cycle interval (seconds)
        * `GITHUB_TOKEN`: GitHub API token (a list, `GITHUB_TOKENS`, is also supported)
        * `GITHUB_KEYWORDS`: List of GitHub repository search keywords
        * `WATCHED_REPOSITORIES`: List of GitHub repositories to watch closely
        * `BLACKLIST_USERS`: User blacklist
        * `BLACKLIST_REPOSITORIES`: Repository blacklist
        * `PRIMARY_AI_CONFIG`: Primary AI service (OpenAI) configuration
        * `BACKUP_AI_CONFIGS`: List of backup AI service (OpenAI) configurations
        * `GEMINI_AI_CONFIG`: Gemini AI service configuration
        * `BLOG_TOKEN`: Blog platform API token

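A `config.py` built from the items above might look roughly like the following. Only the variable names come from the list; every value is a placeholder, and the keys inside the AI configuration dictionaries (`api_key`, `base_url`, `model`) are assumptions about a structure this README does not spell out.

```python
# config.py (illustrative sketch; replace every placeholder before use)

DATABASE_PATH = "database/cve_record.db"  # example path taken from the database section
MONITOR_INTERVAL = 600                    # seconds between monitoring cycles (example value)

# Several tokens can be listed so the monitors can rotate between them.
GITHUB_TOKENS = ["<github-token-1>", "<github-token-2>"]
GITHUB_KEYWORDS = ["<keyword-1>", "<keyword-2>"]
WATCHED_REPOSITORIES = ["<owner>/<repo>"]
BLACKLIST_USERS = []
BLACKLIST_REPOSITORIES = []

# The dictionary keys below are assumed, not confirmed by the project.
PRIMARY_AI_CONFIG = {
    "api_key": "<openai-api-key>",
    "base_url": "<openai-base-url>",
    "model": "<model-name>",
}
BACKUP_AI_CONFIGS = [
    {
        "api_key": "<backup-openai-api-key>",
        "base_url": "<backup-openai-base-url>",
        "model": "gpt-4o-mini-2024-07-18",  # named above as the fallback model
    },
]
GEMINI_AI_CONFIG = {
    "api_key": "<gemini-api-key>",
    "base_url": "<gemini-base-url>",
    "model": "<gemini-model-name>",
}

BLOG_TOKEN = "<blog-platform-token>"  # only needed for automatic blog publishing
```

Because this file holds API keys, it is usually sensible to keep the filled-in version out of version control.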
### 6. Main Program (Main)

* **`main.py`**: **System Main Program**
    * **Initializes** each module (monitors, AI analyzer, article manager, etc.).
    * **Starts** the monitoring cycle, which **periodically** runs CVE monitoring, GitHub repository monitoring, AI analysis, article crawling, and blog publishing tasks.
    * Uses **multi-threading** to run **monitoring** and **AI analysis** concurrently, improving system efficiency.
    * **Exception handling** and **retry mechanisms** help keep the system running stably.
    * **Status Monitoring Thread**: **Periodically checks** the system's operating status and writes it to the log.
    * **Daily Blog Publishing**: Publishes daily security briefing reports to a blog platform automatically on a schedule.
    * **Command-Line Startup**: Users can start and stop the monitoring system from the command line.

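The multi-threaded monitoring cycle described above can be sketched as follows. This is an illustration of the general pattern only: `run_periodically` and `start_workers` are hypothetical names, and the real `main.py` may organize its threads, retries, and shutdown differently.

```python
import threading


def run_periodically(task, interval, stop_event):
    """Call task() every `interval` seconds until stop_event is set."""
    while not stop_event.is_set():
        try:
            task()
        except Exception as exc:  # keep the loop alive if one cycle fails
            print(f"{task.__name__} failed: {exc}")
        # wait() returns early once stop_event is set, so shutdown is prompt.
        stop_event.wait(interval)


def start_workers(tasks, interval, stop_event):
    """Start one daemon thread per task and return the thread objects."""
    threads = []
    for task in tasks:
        thread = threading.Thread(
            target=run_periodically, args=(task, interval, stop_event), daemon=True
        )
        thread.start()
        threads.append(thread)
    return threads
```

In this sketch, a main program would pass its monitoring functions and `MONITOR_INTERVAL` to `start_workers`, then call `stop_event.set()` on Ctrl+C to stop all loops. Waiting on the event instead of calling `time.sleep` lets the threads exit without finishing a full interval.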
## ⚙️ Running Environment

* Python 3.8+
* Dependencies (see `requirements.txt`)

## 📦 Installation Steps

1. **Clone the code repository**

    ```bash
    git clone [Project Repository Address]
    cd [Project Directory]
    ```

2. **Install dependencies**

    ```bash
    pip install -r requirements.txt
    ```

3. **Configure the `config.py` file**

    * Configure the GitHub API token (`GITHUB_TOKEN` or `GITHUB_TOKENS`)
    * Configure the OpenAI API key and base URL (`PRIMARY_AI_CONFIG`, `BACKUP_AI_CONFIGS`)
    * Configure the Gemini API key and base URL (`GEMINI_AI_CONFIG`)
    * Configure the blog platform API token (`BLOG_TOKEN`) if you want to publish to a blog automatically
    * Modify other configuration items as needed, such as the monitoring interval, keyword list, and blacklists

4. **Run the system**

    ```bash
    python main.py
    ```

## 📝 Future Plans

* **More Data Source Support**: Add more security information sources, such as security communities and vulnerability platforms.
* **More Refined AI Analysis**: Continue to optimize prompt engineering to improve the accuracy and depth of AI analysis.
* **Richer Features**: Add features such as vulnerability early warning, threat intelligence visualization, and custom reports.
* **Web UI Management Interface**: Develop a web UI so users can configure and manage the monitoring system more easily.

## 🤝 Contribution

Contributions are welcome! If you have any suggestions or bug reports, please submit an Issue or Pull Request.

## 📜 License

This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.

---

**Thank you for your attention!** ⭐ **Star** this project to support our work!