Documentation
Constants
This section is empty.
Variables
This section is empty.
Functions
This section is empty.
Types
type Request
type Request struct {
	Url         string
	Proxy       string
	Cookie      []*http.Cookie
	MaxRetryNum int
	Meta        map[string]any
	Headers     map[string]string
}
Request describes a single request to be crawled.
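As a rough illustration of how a Request might be filled in, here is a minimal sketch. The struct is mirrored locally so the example compiles on its own; the helper name newExampleRequest, the URL, and all field values are assumptions, and the field meanings (retry limit, per-request metadata) are inferred from the field names rather than stated by the package.

```go
package main

import (
	"fmt"
	"net/http"
)

// Request mirrors the documented struct so this sketch is self-contained.
type Request struct {
	Url         string
	Proxy       string
	Cookie      []*http.Cookie
	MaxRetryNum int
	Meta        map[string]any
	Headers     map[string]string
}

// newExampleRequest is a hypothetical helper, not part of the package.
// All values below are illustrative.
func newExampleRequest() Request {
	return Request{
		Url:         "https://example.com/list?page=1",
		MaxRetryNum: 3, // assumed to cap retry attempts
		Cookie:      []*http.Cookie{{Name: "session", Value: "abc123"}},
		Meta:        map[string]any{"page": 1}, // assumed to be caller-defined data
		Headers:     map[string]string{"User-Agent": "example-spider/0.1"},
	}
}

func main() {
	req := newExampleRequest()
	fmt.Println(req.Url, req.MaxRetryNum, req.Meta["page"])
}
```

Proxy is left empty here because its expected format is not documented.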
type Response
type Response struct {
	Request    Request
	Error      error
	Content    string
	Meta       map[string]any
	StatusCode int
	Xpath      Xpath
}
Response describes the result of a Request.
type Spider
type Spider struct {
	RequestQueue  chan Request
	ResponseQueue chan Response
	Client        *http.Client
	Transport     *http.Transport
	WorkerNum     int
	Stat          spiderStat
	// contains filtered or unexported fields
}
Spider is the crawler.
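The documentation does not describe how Spider uses RequestQueue, ResponseQueue, and WorkerNum. One plausible reading is a fixed pool of workers that drain one channel and fill the other. The sketch below illustrates that pattern under this assumption; runWorkers, the fetch callback, and the trimmed-down Request and Response types are all hypothetical and exist only to keep the example self-contained.

```go
package main

import (
	"fmt"
	"sync"
)

// Trimmed local stand-ins for the documented types.
type Request struct{ Url string }

type Response struct {
	Request    Request
	Content    string
	StatusCode int
}

// runWorkers (hypothetical) starts workerNum goroutines that read from
// requests, call fetch, and send each result to responses. The responses
// channel is closed once requests is closed and every worker has finished.
func runWorkers(workerNum int, requests <-chan Request, responses chan<- Response, fetch func(Request) Response) {
	var wg sync.WaitGroup
	for i := 0; i < workerNum; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for req := range requests {
				responses <- fetch(req)
			}
		}()
	}
	go func() {
		wg.Wait()
		close(responses)
	}()
}

func main() {
	requests := make(chan Request, 4)
	responses := make(chan Response, 4)

	// A fake fetcher stands in for a real HTTP call.
	fake := func(r Request) Response {
		return Response{Request: r, StatusCode: 200, Content: "ok"}
	}
	runWorkers(2, requests, responses, fake)

	for _, u := range []string{"https://example.com/a", "https://example.com/b"} {
		requests <- Request{Url: u}
	}
	close(requests)

	for resp := range responses {
		fmt.Println(resp.Request.Url, resp.StatusCode)
	}
}
```

Closing the response channel from a separate goroutine lets the consumer use a plain range loop without knowing how many workers were started.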
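func (*Spider) RandTransport
func (s *Spider) RandTransport()
RandTransport generates a randomized http.Transport for the Spider.
It sets the Transport's DisableKeepAlives field to true, which disables persistent connections. It also sets the TLSClientConfig field: it skips TLS certificate verification, sets the TLS protocol version range, sets the cipher suites, and sets the client session cache size.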
type Xpath (added in v0.1.1)
Xpath parses HTML using XPath expressions.
func NewXpathParser (added in v0.1.1)
NewXpathParser is the constructor for Xpath.
func (*Xpath) ExtractFirst (added in v0.1.1)
ExtractFirst returns the text content of the first node that matches the expression.