mrh
|
0a0b65e876
Finish the dp last-page check
|
10 months ago |
mrh
|
d09e9d56ca
Modify Excel import management
|
10 months ago |
mrh (aider)
|
77bd06e99f
refactor: Convert excel_import functions to class-based approach
|
10 months ago |
mrh (aider)
|
333e4f2d83
feat: add progress tracking and management for Excel keywords
|
10 months ago |
mrh
|
d3642e09b5
Change the data model architecture
|
10 months ago |
mrh (aider)
|
64f0fb921d
fix: use sqlalchemy.text() for raw SQL execution in drop_table
|
10 months ago |
mrh
|
1c46f2b7ce
refactor: Make `drop_table` function accept a model parameter
|
10 months ago |
mrh (aider)
|
e8250fe763
refactor: modify drop_table to only delete SearchResult table
|
10 months ago |
mrh (aider)
|
5a07937f67
feat: add is_last_page field to track search pagination end
|
10 months ago |
mrh
|
becc835c83
refactor: Add debug print and database initialization in SearchManager
|
10 months ago |
mrh
|
dfca410425
Add a logging library
|
10 months ago |
mrh (aider)
|
3143d25dab
feat: add cache parameter to search methods for database lookup
|
10 months ago |
mrh (aider)
|
51bee11ffb
refactor: extract database save logic into separate method in SearchManager
|
10 months ago |
mrh
|
f5db96a546
refactor: remove unused methods and adjust return value in search_manager
|
10 months ago |
mrh (aider)
|
4ce06ce2c8
refactor: move database logic to DatabaseManager and add duplicate checking
|
10 months ago |
mrh
|
03506cf2d5
The default crawl4ai setup runs into anti-crawling detection; add a Drission manager to see how to work around it
|
10 months ago |
mrh (aider)
|
a33426c675
fix: Pass datetime.now as callable to default_factory in SearchResult
|
10 months ago |
mrh
|
8ab84386e7
refactor: Rename `Keyword` to `SearchResult` and add `DatabaseManager` class
|
10 months ago |
mrh
|
ad8aa1e1f0
Loop over the keywords and search each one. New issue: the page exists but reports no results
|
10 months ago |
mrh
|
c9d00c553e
Import Excel into the database
|
10 months ago |
mrh
|
78e12d7b83
Keyword search after adding DrissionPage
|
10 months ago |
mrh
|
c0573ee7ad
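crawlai does not seem to support a custom browser, only the built-in one. Option 1: fetch pages with an external browser, then pass them to crawlai for parsing. Option 2: check whether the User-agent, session, and cookie can be set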
|
10 months ago |