🔥 CyberSentinel AI - Automated Security Monitoring and AI Analysis System
CyberSentinel AI is an automated security monitoring and AI analysis system designed to track the latest security vulnerabilities (CVEs) and security-related repositories on GitHub in real-time. It leverages Artificial Intelligence (AI) technology for in-depth analysis and automatically publishes valuable security intelligence to a blog platform.
🚀 Key Features
- Multi-Source Data Monitoring:
- CVE Monitoring: Real-time scraping of the latest CVE-related information from GitHub, enabling rapid discovery and tracking of the latest vulnerability trends.
- GitHub Repository Monitoring: Comprehensive monitoring of security-related open-source projects on GitHub through keyword searches and predefined watch lists.
- Intelligent AI Analysis:
- OpenAI & Gemini Dual Engine: Integrated with OpenAI and Gemini AI models, providing powerful natural language processing capabilities for in-depth security data analysis.
- Multi-Dimensional Security Assessment: Evaluation of CVEs and repositories from multiple dimensions, including vulnerability principles, exploitation methods, risk levels, and impact scope, ensuring depth and breadth of analysis.
- Value Judgment and Filtering: Intelligent AI-driven judgment of security information value, automatically filtering out low-value information and focusing on truly noteworthy security threats and technologies.
- Automated Workflow:
- Fully Automated Monitoring: The system runs 24/7 unattended, automating security information collection, analysis, and report generation.
- Daily Security Briefing: Generates daily security briefing reports on a schedule, summarizing the latest CVE vulnerabilities and GitHub security repository dynamics, and pushing them to a blog platform.
- Dynamic Blacklisting: Automatically updates blacklists based on AI analysis results, reducing interference from invalid information and improving monitoring efficiency.
- Flexible Configuration and Management:
- Multi-GitHub Token Support: Supports configuration of multiple GitHub Tokens, intelligently rotating usage to effectively avoid API rate limits.
- Configurable Monitoring Parameters: Keywords, watch repository lists, blacklists, etc., can be flexibly adjusted through configuration files to meet different monitoring needs.
- Detailed Logging: Detailed logs are recorded for all critical steps of system operation, facilitating troubleshooting and system monitoring.
- Automated Blog Publishing:
- Integrated Blog Platform: Integrated with a blog platform API to automatically publish daily security briefing reports, quickly sharing security intelligence.
- Markdown Reports: Analysis results and security briefings are generated in Markdown format, making them easy to read and edit.
🛠️ Technical Implementation
1. Monitoring Modules (Monitors)
-
cve_monitor.py: CVE Monitor- GitHub API Interaction: Uses the GitHub API to search for CVE-related repositories, keyword
CVE-202+, and sorts byupdatedtime. - CVE Information Extraction: Extracts CVE numbers from repository names and descriptions using regular expressions.
- Repository Information Crawling: Retrieves repository descriptions, star counts, update times, recent commits, and other information.
- Blacklist Filtering: Supports user blacklists and repository blacklists to filter out invalid information sources.
- File Content Analysis: Clones repositories locally and intelligently analyzes README.md and other high-priority files, calculates file relevance scores, and initially filters high-value repositories.
- Intelligent Token Management: Implements automatic rotation and status checking of GitHub Tokens, dynamically switching available tokens to ensure the continuity of monitoring tasks.
- Database Storage: Uses the SQLite database
database/cve_record.dbto store CVE records, including CVE numbers, descriptions, publication dates, last modified dates, repository URLs, and other information.
- GitHub API Interaction: Uses the GitHub API to search for CVE-related repositories, keyword
-
github_monitor.py: GitHub Repository Monitor- Keyword Search: Periodically searches GitHub repositories based on the
GITHUB_KEYWORDSlist defined in the configuration fileconfig.py. - Watch List: Supports the
WATCHED_REPOSITORIESlist in the configuration fileconfig.pyto focus monitoring on predefined security repositories. - Repository Information Crawling: Retrieves detailed repository information, including descriptions, star counts, last update times, recent commit records, and more.
- Commit Record Analysis: Crawls the recent commit records of repositories, intelligently analyzes commit information and file changes, and initially judges the security relevance of repositories.
- Blacklist Filtering: Supports user blacklists and repository blacklists to filter out invalid information sources.
- Intelligent Token Management: Shares the Token management mechanism with the CVE monitor.
- Database Storage: Uses the SQLite database
database/github_repo.dbto store GitHub repository records, including repository names, URLs, descriptions, last update times, star counts, whether they are high-value repositories, and other information.
- Keyword Search: Periodically searches GitHub repositories based on the
2. AI Analysis Module (AI)
analyzer.py: AI Analyzer- OpenAI & Gemini API: Integrates OpenAI API (primary) and Gemini API (backup), supports multi-model switching, such as
gpt-4o-mini-2024-07-18(fallback model). - Prompt Engineering: Designed different Prompt templates for different analysis scenarios (CVE analysis, new repository analysis, repository update analysis, specific watch repository analysis) to optimize AI analysis results.
- JSON Format Output: Requires AI to strictly output analysis results in JSON format for easy program parsing and data processing.
- Multi-Dimensional Security Analysis: AI analysis results include rich information such as brief descriptions of vulnerabilities/repositories, detailed summaries, risk levels, key points, technical details, affected components, value assessments, security types, update types, and vulnerability exploitation status.
- Result Validation and Standardization: Performs strict format validation and content standardization on the JSON results returned by AI to ensure the accuracy and usability of the data.
- Dynamic Blacklist Update: Based on AI analysis results, automatically judges whether repositories or users should be added to the blacklist and dynamically updates the blacklist file.
- Analysis Result Persistence: Saves AI analysis results as JSON files and updates corresponding records in the database.
- Article Title Classification: Supports AI classification of security article titles for generating security briefing reports.
- API Failover: When OpenAI API calls fail, automatically switches to backup OpenAI API or Gemini API to improve system stability and availability.
- OpenAI & Gemini API: Integrates OpenAI API (primary) and Gemini API (backup), supports multi-model switching, such as
3. Data Processing and Management (Utils)
-
logger.py: Logger- Uses the
loggingmodule to provide complete logging functionality, covering DEBUG, INFO, WARNING, ERROR, and other levels. - Log information is detailed and structured, making it easy to troubleshoot and monitor the system.
- Logs are output to the file
logs/security_monitor.logand rolled over daily.
- Uses the
-
csv_writer.py: CSV Result Writer (currently not used, can be extended)- Provides the function of exporting analysis results to CSV files for easy data analysis and sharing.
-
article_fetcher.py: Article Fetcher- Multi-Source Fetching: Currently supports fetching security articles from BruceFeIix and D洞见 (doonsec) WeChat official accounts.
- Article Title and URL Extraction: Uses regular expressions to extract article titles and URLs from web page content.
- Retry Mechanism: Uses a backoff strategy retry mechanism to improve the stability and success rate of article fetching.
- Article Title Cleaning: Standardizes and cleans article titles, removing redundant markers and formats.
-
article_manager.py: Article Manager- Article De-duplication: Automatically filters processed article URLs to avoid duplicate analysis and pushing.
- AI Classification Result Processing: Processes AI article title classification results and organizes article lists by category.
- Daily Security Briefing Report Generation: Regularly generates Markdown format daily security briefing reports, summarizing the latest security articles and AI analysis results.
- Automated Blog Publishing: Calls the
blog_manager.pymodule to automatically publish daily security briefing reports to a blog platform. - Article Data Persistence: Saves processed URLs and classified articles as JSON files for easy subsequent use and management.
-
blog_manager.py: Blog Manager- Blog Platform API Interaction: Encapsulates common functions for interacting with blog platform APIs, such as creating articles and updating articles.
- Article ID Mapping Management: Records the article IDs of daily security briefing reports on the blog platform for easy subsequent updates and management.
- Automated Blog Publishing: Implements the function of automatically publishing daily security briefing reports to a blog platform.
4. Database (Database)
database/models.py: Database Model Definition- Uses SQLAlchemy to define two data models,
CVERecord(CVE record) andRepository(GitHub repository record), to facilitate data storage and querying. - The database uses SQLite, and the file paths are
database/cve_record.dbanddatabase/github_repo.db.
- Uses SQLAlchemy to define two data models,
5. Configuration File (Config)
config.py: System Configuration File- Centrally manages the system's configuration parameters, such as database paths, API keys, monitoring intervals, keyword lists, blacklists, etc.
- Facilitates users to customize and adjust system behavior.
- Includes the following main configuration items:
DATABASE_PATH: Database file pathMONITOR_INTERVAL: Monitoring cycle interval (seconds)GITHUB_TOKEN: GitHub API Token (supports listGITHUB_TOKENS)GITHUB_KEYWORDS: List of GitHub repository search keywordsWATCHED_REPOSITORIES: List of GitHub repositories to focus on monitoringBLACKLIST_USERS: User blacklistBLACKLIST_REPOSITORIES: Repository blacklistPRIMARY_AI_CONFIG: Primary AI service (OpenAI) configurationBACKUP_AI_CONFIGS: List of backup AI service (OpenAI) configurationsGEMINI_AI_CONFIG: Gemini AI service configurationBLOG_TOKEN: Blog platform API Token
6. Main Program (Main)
main.py: System Main Program- Initializes each module (monitors, AI analyzer, article manager, etc.).
- Starts the monitoring cycle, regularly executing CVE monitoring, GitHub repository monitoring, AI analysis, article crawling, and blog publishing tasks.
- Uses multi-threading to achieve concurrent monitoring and AI analysis, improving system efficiency.
- Exception handling and retry mechanisms ensure stable system operation.
- Status Monitoring Thread: Regularly checks system operating status and records logs.
- Daily Blog Publishing: Regularly automatically publishes daily security briefing reports to a blog platform.
- Command-Line Startup: Users can start and stop the monitoring system via the command line.
⚙️ Running Environment
- Python 3.8+
- Dependencies (see
requirements.txt)
📦 Installation Steps
-
Clone the code repository
git clone [Project Repository Address] cd [Project Directory] -
Install dependencies
pip install -r requirements.txt -
Configure the
config.pyfile- Configure GitHub API Token (
GITHUB_TOKENorGITHUB_TOKENS) - Configure OpenAI API key and Base URL (
PRIMARY_AI_CONFIG,BACKUP_AI_CONFIGS) - Configure Gemini API key and Base URL (
GEMINI_AI_CONFIG) - Configure Blog platform API Token (
BLOG_TOKEN) (if you need to automatically publish to a blog) - Modify other configuration items as needed, such as monitoring interval, keyword list, blacklist, etc.
- Configure GitHub API Token (
-
Run the system
python main.py
📝 Future Plans
- More Data Source Support: Expand support for more security information sources, such as security communities, vulnerability platforms, etc.
- More Refined AI Analysis: Continuously optimize Prompt engineering to improve the accuracy and depth of AI analysis.
- Richer Features: Such as vulnerability early warning, threat intelligence visualization, custom reports, etc.
- Web UI Management Interface: Develop a Web UI management interface to facilitate users to configure and manage the monitoring system.
🤝 Contribution
Contributions are welcome! If you have any suggestions or bug reports, please submit an Issue or Pull Request.
📜 License
This project is licensed under the MIT License - see the LICENSE file for details.
Thank you for your attention! ⭐ Star this project to support our work!