To understand the importance of maliciousness score and the consequences of changing the scores,
we performed a sensitivity analysis that sets the maliciousness score of all explicitly sandboxed functions to 1, regardless of their input arguments. Without this fine-grained scoring, \toolname misses an additional 176 malware samples (4.5\% additional false negatives) and incorrectly flags 73 benign files (1.8\% additional false positives).
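As a hedged illustration of this difference (the function names, argument checks, and score values below are our own assumptions for exposition, not \sysname's actual scoring tables), fine-grained scoring conditions a function's score on its arguments, while the coarse variant used in the sensitivity analysis does not:

```python
def fine_grained_score(func_name: str, args: list) -> float:
    # Hypothetical scoring: eval() of attacker-controlled input (e.g.,
    # $_POST data) is treated as far more suspicious than eval() of a
    # constant string.
    if func_name == "eval":
        if any("$_POST" in arg for arg in args):
            return 1.0
        return 0.3
    return 0.0

def coarse_score(func_name: str, args: list) -> float:
    # Sensitivity-analysis variant: every explicitly sandboxed function
    # scores 1 regardless of its arguments, losing the distinction above.
    return 1.0 if func_name == "eval" else 0.0
```

Under the coarse variant, benign uses of sandboxed functions become indistinguishable from malicious ones, which is consistent with the additional false positives and false negatives reported above.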
\noindent
\textbf{Newly Identified Malware Samples by \sysname.}
While \toolname identified an additional 1,485 malware samples in Dataset A that are not detected by the 70 antivirus scanners in VirusTotal, this does not necessarily mean that they are previously unknown malware.
%\subsubsection{Analysis of Real-world Malware Samples}
We present an investigation of two malware samples that are not detected by VirusTotal. %We investigated these samples in detail to show how \sysname uncovers hidden malicious behaviors.
\noindent
{\bf Sample I: Delivering Payload through Benign Website. }
Fig.~\ref{fig:malware_sample1}-(a) shows the malware in its original (i.e., obfuscated) form. Due to the obfuscation, most AVs in VT fail to detect it. We leverage \sysname to deobfuscate the malware; the result is shown in Fig.~\ref{fig:malware_sample1}-(b). When we scan the deobfuscated code with VT, 2 AVs detect it as malware, indicating that the sample's obfuscation successfully evades detection.
Note that even the deobfuscated malware is detected by only 2 AVs, suggesting the {\it limitation of signature-based tools}.
\end{figure}
The code has several evasive tricks. First, it calculates an MD5 value from an external input (e.g., {\tt\$\_POST}), and only when it matches the hardcoded MD5 value does it decode and execute remote code provided by an attacker \blkcc{1}.
Moreover, Lines 5-15 show that it checks several superglobal variables (e.g., {\tt\$\_REQUEST}) to identify the right victim \blkcc{2}.
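The gating trick in \blkcc{1} can be sketched as follows. This is a minimal Python analogue of the described PHP logic, not the sample's actual code; the parameter name, key, and digest are our own hypothetical choices, and the payload decoding and execution are deliberately omitted:

```python
import hashlib

# Hypothetical hardcoded digest; here, the MD5 of "password".
HARDCODED_MD5 = "5f4dcc3b5aa765d61d8327deb882cf99"

def gate_passes(post_params: dict) -> bool:
    """Return True only if the attacker-supplied key hashes to the
    hardcoded MD5 value; only then would the malware decode and
    execute the remote payload (omitted here)."""
    key = post_params.get("key", "")
    return hashlib.md5(key.encode()).hexdigest() == HARDCODED_MD5
```

Because the correct key never appears anywhere in the file, an analyzer that does not force both branches of this comparison never observes the malicious path.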
\noindent
{\bf Real-world Website Deployments (Dataset A).} To understand \sysname's impact in practice, we ran \toolname on a large dataset of \textbf{1 TB} of files (consisting of 87 real-world websites deployed in the wild).
The dataset is provided by a commercial web hosting company that maintains nightly backups of over 400,000 websites. For each backup, Linux Malware Detector~\cite{maldet} is used to scan every file in the backup.
%If any files in a backup is flagged as malware, the entire backup is added to the dataset.
If any file in a website is flagged as malware, the entire website (i.e., all of its files) is included in the dataset; if no file is flagged, the website's files are not included. Because Linux Malware Detector has both false positives and false negatives, flagged files may not be malicious and unflagged files may be malicious. Consequently, the dataset includes both potentially benign and potentially malicious files, at least one of which was flagged as malware by Linux Malware Detector. Section~\ref{section:scanning_real_websites} provides more details regarding the diversity of the dataset.
\noindent
{\bf Real-world and Synthesized Malware Samples (Dataset B).}
...
...
%Although the first case study provides good insights about \sysname's capabilities and limitations,
We used a large corpus of 87 real-world infected websites consisting of 3,225,403 files (approximately 1 TB) to demonstrate the effectiveness of \sysname.
The dataset includes various malware found in the wild, which shows how \sysname can perform against realistic advanced malware.
The websites were collected for analysis because at least one of the files in each website is marked as malicious by Linux Malware Detector (maldet).
Details on file extension distribution in the dataset can be found in Appendix~\ref{appendix:fileexts}.
...
...
\noindent
{\bf Results.}
%Table~\ref{table:false-positives} shows the results.
\toolname does not flag any files as malware when scanning these benign applications, meaning that it has no false positives.
BackdoorMan and PHP Malware Detector emit hundreds of {\it false warnings (categorized as suspicious)} when scanning these applications.
Specifically, BackdoorMan generates 393, 514, 263, and 688 warnings, and PHP Malware Detector emits 251, 1141, 36, and 36 warnings for WordPress, Joomla, phpMyAdmin, and CakePHP, respectively.
Note that those warnings are false positives as those applications are all benign.