Research on Webshell Detection Methods Based on Feature Engineering and Threat Intelligence

doi:10.11871/jfdc.issn.2096-742X.2022.05.009

Abstract

[Objective] A Webshell is an executable script that an attacker plants on a web server as a Trojan by exploiting injection, XSS, file-upload, and other vulnerabilities. Because Webshells differ in implementation language, vary in how they are exploited, and are stealthy by nature, detection methods are needed that accurately identify attempts to infiltrate and compromise websites. Such methods support early warning, investigation and assessment, and the fight against hacking offenses such as illegal intrusion into computer information systems. [Methods] This paper proposes a new method that studies the behavioral data of Webshell malicious code and extracts features from it, and carries out experiments and applications of feature-based Webshell detection and cybersecurity threat intelligence modeling on HTTP traffic. [Results] Experiments and actual deployment show that the extracted feature values identify Webshells with high accuracy and detect malicious attacks effectively. [Conclusions] Although detection based on feature engineering carries a heavy maintenance burden, it achieves higher accuracy and efficiency in detecting known, specific attacks, which makes it valuable in practice for preventing and combating hacking crimes.

Key words: hacking crime, Webshell, HTTP protocol, feature engineering, cybersecurity threats

XU Bo, JIANG Zhengwei, XIN Liling, ZHOU Yufei. Research on Webshell Detection Methods Based on Feature Engineering and Threat Intelligence[J]. Frontiers of Data and Computing, 2022, 4(5): 77-86, https://cstr.cn/32002.14.jfdc.CN10-1649/TP.2022.05.009.

Figures/Tables (9): Table 1, Table 2, Table 3, Table 4, Table 5, Fig. 1, Table 6, Table 7, Table 8

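Request line	POST /index.php HTTP/1.1	Request method, request URL, HTTP protocol and version
Request headers	User-Agent: image/jpeg	Operating system version, browser type, and other information about the client that generated the request
	Accept: zh-CN	Content types the client can recognize and process
	Host: 127.0.0.1:8088	Host name of the request; allows multiple domain names to share one IP address (virtual hosting)
	Blank line	Separates the headers from the body and tells the server that no more request headers follow
Request data	Name=admin&password=123456	Present only in POST requests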
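
The request structure in the table above can be illustrated with a minimal Python sketch that splits a raw HTTP request into its request line, headers, and request data. The function name `parse_http_request` and the returned dictionary layout are assumptions made for illustration; they are not taken from the paper, and a production system would rely on a full HTTP parser.

```python
def parse_http_request(raw: str) -> dict:
    """Split a raw HTTP request into request line, headers, and body (illustrative only)."""
    # The blank line (CRLF CRLF) separates the headers from the request data.
    head, _, body = raw.partition("\r\n\r\n")
    lines = head.split("\r\n")
    # Request line: method, URL, protocol and version.
    method, url, version = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        # Split on the first colon only, so "Host: 127.0.0.1:8088" keeps its port.
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return {
        "method": method,
        "url": url,
        "version": version,
        "headers": headers,
        "body": body,
    }


raw = (
    "POST /index.php HTTP/1.1\r\n"
    "Host: 127.0.0.1:8088\r\n"
    "Accept: zh-CN\r\n"
    "\r\n"
    "Name=admin&password=123456"
)
request = parse_http_request(raw)
```

Separating the URL from the body in this way matters for the feature tables, which match some feature values against the URL and others against the POST body.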

Trojan type	Common Trojan file names
PHP Trojan	index1.php, index.php, phpmyadmin.php, 404.php
ASPX Trojan	WebAdmin2xE.aspx
JSP Trojan	404.jsp, t00ls.jsp, terms.jsp, test3693, wartree.jsp
ASP Trojan	x.asp, xx.asp, rootkit.asp, 90sec.asp, osec.asp
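
As a rough illustration of how the file names above might be used, the following Python sketch checks whether the last path segment of a requested URL appears in the list. The set `KNOWN_TROJAN_NAMES` and the function `has_known_trojan_name` are hypothetical names introduced here. Since names such as index.php are also common on legitimate sites, a hit is presumably only a weak hint to be combined with other features, not a verdict.

```python
import posixpath
from urllib.parse import urlsplit

# File names from the table above, lowercased for case-insensitive comparison.
KNOWN_TROJAN_NAMES = {
    "index1.php", "index.php", "phpmyadmin.php", "404.php",
    "webadmin2xe.aspx",
    "404.jsp", "t00ls.jsp", "terms.jsp", "test3693", "wartree.jsp",
    "x.asp", "xx.asp", "rootkit.asp", "90sec.asp", "osec.asp",
}


def has_known_trojan_name(url: str) -> bool:
    """Return True if the last path segment of the URL is a listed Trojan file name."""
    path = urlsplit(url).path  # drops the query string and fragment
    return posixpath.basename(path).lower() in KNOWN_TROJAN_NAMES
```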

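Password for accessing the Webshell	The password is actually the name of a parameter passed to the script in PHP or a similar language;
Commands executed by the Webshell	Command parameters passed by one-line Trojans and "small" Trojans may be encrypted and unreadable; "large" Trojans and GET requests commonly carry plaintext URL parameters;
Character encoding used when the Webshell establishes a connection	For example, GB2312.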
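
The three items above (password, command, character encoding) can be pulled out of a POST body with a short Python sketch. It rests on assumptions that go beyond the source: the first form parameter is treated as the password parameter whose value carries the command, and any later parameter whose value is a known charset name is treated as the encoding. The function `extract_webshell_fields`, the set `KNOWN_CHARSETS`, and the sample body are all hypothetical.

```python
from urllib.parse import parse_qsl

# Assumption: charset names that a client might announce in a parameter value.
KNOWN_CHARSETS = {"gb2312", "utf-8"}


def extract_webshell_fields(body: str) -> dict:
    """Extract password parameter, command payload, and charset from a form-encoded body."""
    params = parse_qsl(body, keep_blank_values=True)
    fields = {"password": None, "command": None, "charset": None}
    if not params:
        return fields
    # Assumption: the first parameter name is the access password,
    # and its value is the command passed to the script.
    fields["password"], fields["command"] = params[0]
    for _name, value in params[1:]:
        if value.lower() in KNOWN_CHARSETS:
            fields["charset"] = value
            break
    return fields


# Hypothetical one-line-Trojan request body, URL-encoded as it would appear on the wire.
body = "pass=%40eval(base64_decode(%24_POST%5Bz1%5D))%3B&z0=GB2312&z1=ZWNobyAxOw%3D%3D"
fields = extract_webshell_fields(body)
```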

Feature value	Event name	Verification type
base64_decode	PHP Trojan	POST body
action=B z1=	Cknife Chopper	POST body
response.write response.end eval	ASP Chopper connection	POST body
@eval $_POST z0=	PHP Chopper connection	POST body
eval z1= z2=	Chopper variant	POST body
Paint.NET V3.5.11G	Chopper connection	POST body
5061696E742E4E45542076332E352E313147	Chopper connection	POST body
“/cgi-bin/frame.cgi?p=system_admin&a=LOGIN”	Overseas hacker attack	POST body
eval POST stripslashes	Chopper variant	POST body
admin.php password frames	Discuz forum collection	POST body
z0=UTF-8	Chopper UTF-8	POST body
eval z0= z9=BaSE64_dEcOdE	Chopper connection	POST body
ute isnumeric z2=	Chopper connection	POST body
edoced_46esab xise z0=	XISE Chopper connection	POST body
array_map @ev	New Chopper signature	POST body
o=filelist folder=	JSPSPY file-browsing behavior	POST body
(response.write & response.end) & (z1= | z2= | z3=)	ASPX Chopper connection	POST body
eval z0= z1=	Chopper connection	POST body
z0=GB2312	Chopper connection (GB2312 encoding)	POST body
“eval(System.Text.Encoding.GetEncoding(“	Cknife backdoor management tool (aspx2)	POST body
“@eval($_POST[“	One-line Trojan upload event	POST body
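
A minimal Python sketch of how feature values like those above might be matched against a POST body follows. It assumes, without confirmation from the paper, that every space-separated token of a feature value must be present in the body and that matching is case-insensitive after URL decoding. Only a few rows are included, the event names are English renderings of the table entries, and `POST_BODY_RULES` and `match_post_body` are hypothetical names.

```python
from urllib.parse import unquote_plus

# (tokens that must all be present, event name); a subset of the table above.
POST_BODY_RULES = [
    (("base64_decode",), "PHP Trojan"),
    (("@eval", "$_POST", "z0="), "PHP Chopper connection"),
    (("eval", "z1=", "z2="), "Chopper variant"),
    (("z0=GB2312",), "Chopper connection (GB2312 encoding)"),
    (("o=filelist", "folder="), "JSPSPY file-browsing behavior"),
]


def match_post_body(body: str) -> list:
    """Return the event names of all rules whose tokens all occur in the decoded body."""
    # Assumption: decode percent-encoding first and compare case-insensitively.
    text = unquote_plus(body).lower()
    return [
        event
        for tokens, event in POST_BODY_RULES
        if all(token.lower() in text for token in tokens)
    ]


suspicious = "pass=%40eval(base64_decode(%24_POST%5Bz0%5D))%3B&z0=GB2312"
hits = match_post_body(suspicious)
```

One body can trigger several events at once, as this sample does; how overlapping hits are ranked or merged is a design decision the sketch leaves open.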

Feature value	Event name	Verification type
.jsp?Action=command	JSP "large" Trojan	URL
arry_map ;@ev	Chopper 102019	URL
JspSpyPwd	JspSpy Trojan event	URL
DarkBladePass= goaction=	DarkBlade Trojan login	URL
“21|23|25|80|110|135|139|445|1433|3306|3389|43958”	phpspy	URL
Action=getTerminalInfo	ASP Trojan retrieving server information	URL
ASPXSpy	Xspy Trojan	URL
arry_map ;@ev	Chopper	URL
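
Because each feature row names the part of the request it is verified against, a single rule list can carry that location and dispatch on it. The Python sketch below illustrates this idea under the same assumptions as before (all tokens required, case-insensitive comparison after URL decoding). The `Rule` class, the `detect` function, and the location labels `"url"` and `"post_body"` are hypothetical, and only a few rows are shown.

```python
from dataclasses import dataclass
from urllib.parse import unquote


@dataclass(frozen=True)
class Rule:
    tokens: tuple   # tokens that must all be present
    event: str      # event name reported on a match
    location: str   # "url" or "post_body": where the tokens are looked up


RULES = [
    Rule((".jsp?Action=command",), "JSP large Trojan", "url"),
    Rule(("JspSpyPwd",), "JspSpy Trojan event", "url"),
    Rule(("DarkBladePass=", "goaction="), "DarkBlade Trojan login", "url"),
    Rule(("Action=getTerminalInfo",), "ASP Trojan retrieving server information", "url"),
    Rule(("z0=UTF-8",), "Chopper UTF-8", "post_body"),
]


def detect(url: str, body: str = "") -> list:
    """Return event names for rules whose tokens all occur in their target field."""
    fields = {"url": unquote(url).lower(), "post_body": unquote(body).lower()}
    return [
        rule.event
        for rule in RULES
        if all(token.lower() in fields[rule.location] for token in rule.tokens)
    ]
```

For example, `detect("/admin/shell.jsp?Action=command&cmd=whoami")` reports the JSP large Trojan event, while a plain request such as `detect("/index.php?id=1")` yields an empty list.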