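CodeQL is an analysis engine: it first converts source code into a database, then finds vulnerabilities by running SQL-like QL queries against that database. LGTM is an online service built on CodeQL.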

Vulnerability types

Vulnerabilities that can be found through data-flow analysis and pattern matching

Supported languages

• C/C++
• C#
• Go
• Java
• Python
• JavaScript (includes JSX and Flow code, YAML, JSON, HTML, and XML files)
• TypeScript[New]
Details: https://help.semmle.com/codeql/supported-languages-and-frameworks.html#

Usage

CLI installation

Download three components and place them in the codeql-home directory:

$ tree -L 1
.
├── codeql          # CodeQL engine, https://github.com/github/codeql-cli-binaries/releases
├── codeql-go       # Go analysis support, https://github.com/github/codeql-go/
└── codeql-repo     # query rules, https://github.com/github/codeql

Verify the setup by listing the query packs that the CLI can resolve:

$ ./codeql/codeql resolve qlpacks
codeql-cpp (/Users/xuwenyuan/workspace/codeql-home/codeql-repo/cpp/ql/src)
codeql-cpp-tests (/Users/xuwenyuan/workspace/codeql-home/codeql-repo/cpp/ql/test)
codeql-cpp-upgrades (/Users/xuwenyuan/workspace/codeql-home/codeql-repo/cpp/upgrades)
codeql-csharp (/Users/xuwenyuan/workspace/codeql-home/codeql-repo/csharp/ql/src)
codeql-csharp-tests (/Users/xuwenyuan/workspace/codeql-home/codeql-repo/csharp/ql/test)
...
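
The check above can also be scripted. The following is a minimal sketch, assuming the output format shown above (one "<pack-name> (<path>)" entry per line); has_qlpack is a hypothetical helper name, not part of the CodeQL CLI.

```shell
# Hypothetical helper: succeed if the named pack appears in the output of
# `codeql resolve qlpacks`, which is expected on stdin.
has_qlpack() {
  # Each line looks like "<pack-name> (<path>)"; compare the first field exactly
  # so that "codeql-cpp" does not match "codeql-cpp-tests".
  awk -v pack="$1" '$1 == pack { found = 1 } END { exit !found }'
}
```

It could be used as: ./codeql/codeql resolve qlpacks | has_qlpack codeql-cpp && echo "cpp pack found"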

Building a database for the project under test

For projects written in interpreted languages:

codeql database create --language=<language-identifier> --source-root <folder-to-extract> <database>

For example:

codeql database create --language=javascript codeql-database

For compiled languages:

codeql database create --language=cpp <output-folder>/cpp-database --command=make

If --command is not specified, CodeQL falls back to its built-in default build.

Identifiers that differ from the language name:

C/C++: cpp
C#: csharp
JavaScript/TypeScript: javascript

It is worth noting that LGTM's implementation runs the builds for all languages together.
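
The mapping above can be captured in a small shell helper. This is only a sketch: codeql_lang_id is a hypothetical name, and it assumes that Go, Java, and Python use their lowercased names (go, java, python) as identifiers.

```shell
# Hypothetical helper: map a language name to the identifier passed to --language.
codeql_lang_id() {
  local name
  # Normalize case so "Java" and "java" are treated the same.
  name="$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')"
  case "$name" in
    c|c++|cpp)             echo cpp ;;
    'c#'|csharp)           echo csharp ;;
    javascript|typescript) echo javascript ;;
    go|java|python)        echo "$name" ;;
    *)
      # Anything else is rejected rather than guessed.
      echo "unsupported language: $1" >&2
      return 1
      ;;
  esac
}
```

For example: codeql database create --language="$(codeql_lang_id 'C++')" cpp-database --command=make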

Running a scan

codeql database analyze <database> <queries> --format=<format> --output=<output> --threads=<threads>

For example (note that the LGTM query suite is selected here):

codeql database analyze ./codeql-database/ ~/workspace/codeql-home/codeql-repo/javascript/ql/src/codeql-suites/javascript-lgtm-full.qls --format=csv --output=codeql.csv --threads=10
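
The two steps, building the database and then analyzing it, can be chained in one shell function. The following is a minimal sketch under stated assumptions: scan_project and the CODEQL variable are hypothetical names introduced here, the output format is fixed to CSV, and --threads=0 is used to let CodeQL pick one thread per core.

```shell
# Path to the CodeQL CLI; override it when codeql is not on PATH,
# for example CODEQL=./codeql/codeql.
CODEQL="${CODEQL:-codeql}"

# Hypothetical helper: build a database, then analyze it and write CSV results.
# Arguments: <language-id> <source-root> <database-dir> <query-suite> <output.csv> [build-command]
scan_project() {
  local lang="$1" src="$2" db="$3" suite="$4" out="$5" build="${6:-}"

  if [ -n "$build" ]; then
    # Compiled languages: pass the project's own build command.
    "$CODEQL" database create --language="$lang" --source-root "$src" "$db" \
      --command="$build" || return 1
  else
    # No build command: interpreted language, or rely on the default build.
    "$CODEQL" database create --language="$lang" --source-root "$src" "$db" || return 1
  fi

  # Run the chosen query suite against the freshly built database.
  "$CODEQL" database analyze "$db" "$suite" --format=csv --output="$out" --threads=0
}
```

For example: scan_project javascript . codeql-database ~/workspace/codeql-home/codeql-repo/javascript/ql/src/codeql-suites/javascript-lgtm-full.qls codeql.csv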

Writing rules

Guides for writing CodeQL: https://github.com/github/codeql/blob/master/docs/ql-style-guide.md and https://help.semmle.com/QL/learn-ql/. The following code is an example rule that uses taint analysis to detect XSS:

import java
import semmle.code.java.dataflow.FlowSources
import semmle.code.java.dataflow.TaintTracking2
import semmle.code.java.security.XSS
import DataFlow2::PathGraph

// Taint-tracking configuration: remote user input reaching an XSS sink.
class XSSConfig extends TaintTracking2::Configuration {
  XSSConfig() { this = "XSSConfig" }

  // Sources: any remote, user-controlled input.
  override predicate isSource(DataFlow::Node source) { source instanceof RemoteFlowSource }

  // Sinks: the XSS sinks modeled by the imported XSS library.
  override predicate isSink(DataFlow::Node sink) { sink instanceof XssSink }

  // Numeric and boolean values are treated as sanitized.
  override predicate isSanitizer(DataFlow::Node node) {
    node.getType() instanceof NumericType or node.getType() instanceof BooleanType
  }
}

// Report every tainted path from a source to a sink.
from DataFlow2::PathNode source, DataFlow2::PathNode sink, XSSConfig conf
where conf.hasFlowPath(source, sink)
select sink.getNode(), source, sink, "Cross-site scripting vulnerability due to $@.",
  source.getNode(), "user-provided value"

Other researchers have already written plenty of introductions to analysis with CodeQL, so that topic is not covered here.