<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" version="3.10.0">Jekyll</generator><link href="https://machiry.github.io/feed.xml" rel="self" type="application/atom+xml" /><link href="https://machiry.github.io/" rel="alternate" type="text/html" /><updated>2026-02-04T04:49:33-08:00</updated><id>https://machiry.github.io/feed.xml</id><title type="html">Aravind Machiry @ PurS3 Lab</title><subtitle>Assistant Professor at Purdue University</subtitle><author><name>Aravind Machiry</name><email>amachiry@purdue.edu</email></author><entry><title type="html">Trivial Suggestions: Doing Effective Related Work (Part 1)</title><link href="https://machiry.github.io/posts/2021/05/doing-related-work-part1/" rel="alternate" type="text/html" title="Trivial Suggestions: Doing Effective Related Work (Part 1)" /><published>2021-05-27T00:00:00-07:00</published><updated>2021-05-27T00:00:00-07:00</updated><id>https://machiry.github.io/posts/2021/05/related-work-survey-part1-7</id><content type="html" xml:base="https://machiry.github.io/posts/2021/05/doing-related-work-part1/"><![CDATA[<p>When I was a graduate student, I did face several difficulties in dealing with writing related work and organizing references.</p>

<p>This “Trivial Suggestions” post is the first of a two-part series on managing citations and writing the related work section.</p>

<h2 id="handling-citations">Handling Citations</h2>
<p>First, install a citation manager.</p>

<p>There are many citation managers, such as Zotero, Mendeley, RefWorks, etc.</p>

<p>I use Zotero and am happy with it. The Chrome plugin is amazing. You can add a citation for an article, or for a list of Google Scholar results, with a single click.</p>

<p>Here is a decent tutorial on using Zotero: https://www.youtube.com/watch?v=Hm0TboOcAuM</p>

<p>Also, check out EndNote Click (thanks to Keerthi), which uses your university account to download PDFs that need special access.</p>

<h3 id="advantages-of-using-citation-managers">Advantages of using citation managers:</h3>

<ul>
  <li>Organization: You can categorize all the works at a high level (e.g., into folders).</li>
</ul>

<blockquote>
  <p>For instance, I can organize all the works related to bootloaders based on the type of problem they focus on. E.g., <code class="language-plaintext highlighter-rouge">Bootloader</code> (topmost level) <code class="language-plaintext highlighter-rouge">-&gt;</code> <code class="language-plaintext highlighter-rouge">Defenses</code>, <code class="language-plaintext highlighter-rouge">Attacks</code>, etc.</p>
</blockquote>

<ul>
  <li>Interoperability: Citations can be exported to various formats without manually reformatting the works. With a single click, you can export all your related works in the desired format and put them into your proposal, paper, etc.</li>
</ul>

<h2 id="how-to-search-for-related-work">How to search for related work:</h2>

<p>First, have answers to at least the following questions:</p>

<ul>
  <li>What is the problem you are trying to solve?</li>
</ul>

<blockquote>
  <p>E.g., “Finding vulnerabilities in bootloaders”, “Helping students learn better programming”, “Automatically understanding human emotions from their voice”, etc.</p>
</blockquote>

<p>Once you know the problem:</p>

<p>– Find techniques (hopefully, other than yours) people have used to solve it.</p>

<p>– Find works that show that the problem is important.</p>

<ul>
  <li>What techniques are you trying to use to solve the problem?</li>
</ul>

<blockquote>
  <p>E.g., “Static analysis”, “Fuzzing”, etc.</p>
</blockquote>

<p>Once you have figured out the technique:</p>

<p>– Find other problems which most commonly use the technique.</p>

<p>– Find works that introduced the technique.</p>

<h2 id="how-old-should-the-related-work-be">How old should the related work be?</h2>

<p>How far back (chronologically) should we go to consider a work to be relevant?</p>

<p>This depends on the specific field and how active the area of research is. E.g., in machine learning, an extremely active area of research, anything older than three years may be considered irrelevant.</p>

<p>For system security, I suggest five years. However, this again depends on the specific problem and approach you are trying to use. Maybe you are using a very old approach (say ten years old) for a new problem. In that case, even though the work is old, you should cite the paper proposing the old approach.</p>

<p>In Part 2, we will see how to write a good related work section.</p>]]></content><author><name>Aravind Machiry</name><email>amachiry@purdue.edu</email></author><category term="TrivialSuggestions" /><category term="Academia" /><category term="Writing" /><category term="RelatedWork" /><summary type="html"><![CDATA[When I was a graduate student, I did face several difficulties in dealing with writing related work and organizing references.]]></summary></entry><entry><title type="html">Setting up DLXOS (ECE 469) on Cent OS 7</title><link href="https://machiry.github.io/posts/2021/01/dlxos-centos-7/" rel="alternate" type="text/html" title="Setting up DLXOS (ECE 469) on Cent OS 7" /><published>2021-01-19T00:00:00-08:00</published><updated>2021-01-19T00:00:00-08:00</updated><id>https://machiry.github.io/posts/2021/01/setting-up-dlxos-centos-7</id><content type="html" xml:base="https://machiry.github.io/posts/2021/01/dlxos-centos-7/"><![CDATA[<p>This post describes the steps to setup DLXOS needed for Purdue ECE 469 labs on Cent OS 7 VirtualBox VM.</p>

<p>There are three steps here: setting up the VM, installing dependencies, and setting up the DLXOS tools.</p>

<h2 id="vm-setup">VM Setup</h2>
<ol>
  <li>Download the <a href="https://sourceforge.net/projects/linuxvmimages/files/VirtualBox/C/7/CentOS_7.7.1908_VBG.zip/download">Cent OS 7 Virtual Box Image</a></li>
  <li>Extract the downloaded archive and <a href="https://docs.oracle.com/cd/E26217_01/E26796/html/qs-import-vm.html">import the .ova VM</a>
    <blockquote>
      <p>You may need to change the network adapter to NAT</p>
    </blockquote>
  </li>
</ol>

<blockquote>
  <p>Log in to the VM with username: centos and password: centos</p>
</blockquote>

<blockquote>
  <p>The following steps have to be run inside the VM.</p>
</blockquote>

<h2 id="install-dependencies">Install Dependencies</h2>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>sudo yum install glibc.i686
sudo yum install libstdc++.so.5
</code></pre></div></div>

<h2 id="setting-up-dlxos-tools">Setting up DLXOS tools</h2>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>mkdir ~/ee469
cd ~/ee469
scp [your-account]@ecegrid.ecn.purdue.edu:~ee469/labs/common/dlxos_new.tar.gz .
tar -xvzf dlxos_new.tar.gz
</code></pre></div></div>
<h3 id="setting-up-the-path">Setting up the PATH</h3>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>gedit ~/.bashrc
# at the end of the file, append the following line
export PATH=~/ee469/dlxos_new/bin:$PATH
# save and close gedit
</code></pre></div></div>
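<p>If you prefer not to open an editor, the same change can be made from the terminal. This is a minimal sketch, assuming the tools were extracted to <code class="language-plaintext highlighter-rouge">~/ee469/dlxos_new</code> as above; the <code class="language-plaintext highlighter-rouge">LINE</code> variable is just a helper name used here.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># The exact line we want at the end of ~/.bashrc (kept unexpanded by the single quotes).
LINE='export PATH=~/ee469/dlxos_new/bin:$PATH'
# Make sure the file exists, then append the line only if it is not already there,
# so the snippet is safe to run more than once.
touch ~/.bashrc
grep -qxF "$LINE" ~/.bashrc || echo "$LINE" | tee -a ~/.bashrc
# Reload the file so the current shell picks up the new PATH.
source ~/.bashrc
</code></pre></div></div>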

<p>That’s it. You are all set.</p>

<p>To test, open a terminal and run <code class="language-plaintext highlighter-rouge">dlxsim</code>, you should see some help message.</p>]]></content><author><name>Aravind Machiry</name><email>amachiry@purdue.edu</email></author><category term="DLXOS" /><category term="ECE 469" /><summary type="html"><![CDATA[This post describes the steps to setup DLXOS needed for Purdue ECE 469 labs on Cent OS 7 VirtualBox VM.]]></summary></entry><entry><title type="html">Kernel Debugging Ubuntu 16.04</title><link href="https://machiry.github.io/posts/2020/08/ubuntu-kernel-debug/" rel="alternate" type="text/html" title="Kernel Debugging Ubuntu 16.04" /><published>2020-08-24T00:00:00-07:00</published><updated>2020-08-24T00:00:00-07:00</updated><id>https://machiry.github.io/posts/2020/08/kernel-debugging-ubuntu16.04</id><content type="html" xml:base="https://machiry.github.io/posts/2020/08/ubuntu-kernel-debug/"><![CDATA[<blockquote>
  <p>All the following steps are tested on Ubuntu 16.04</p>
</blockquote>

<p>There are three steps here: building the QEMU image, building and installing the kernel, and debugging.</p>

<h2 id="building-qemu-image">Building qemu image</h2>
<h3 id="install-qemu">Install qemu</h3>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>sudo apt-get install qemu-kvm qemu virt-manager virt-viewer libvirt-bin
</code></pre></div></div>
<h3 id="create-a-qcow-disk">Create a qcow disk</h3>
<p>Here, we will create a virtual disk on which Ubuntu will be installed.</p>
<blockquote>
  <p>Why qcow2 and not a regular image?
Because we can increase the size of a qcow2 image later, whereas increasing the size of a regular image is tricky.</p>
</blockquote>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>qemu-img create -f qcow2 ubuntu16.04.qcow 40G
</code></pre></div></div>
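<p>As a rough sketch of what “increasing the size later” looks like, the commands below inspect the image and then grow it. The <code class="language-plaintext highlighter-rouge">+10G</code> amount is only an illustrative value, and the VM should be shut down while resizing.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Show the image format, its virtual size, and the space it currently uses on disk.
qemu-img info ubuntu16.04.qcow
# Grow the virtual disk by 10G. The partition and filesystem inside the guest
# are not touched and still have to be extended separately.
qemu-img resize ubuntu16.04.qcow +10G
</code></pre></div></div>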

<h3 id="download-ubuntu-image">Download ubuntu image</h3>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>wget https://releases.ubuntu.com/16.04/ubuntu-16.04.7-desktop-amd64.iso
</code></pre></div></div>
<h3 id="installing-ubuntu-1604-on-the-disk">Installing Ubuntu 16.04 on the disk</h3>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>qemu-system-x86_64 -hda ubuntu16.04.qcow -boot d -cdrom ./ubuntu-16.04.7-desktop-amd64.iso -device virtio-net,netdev=vmnic -netdev user,id=vmnic -m 4G
</code></pre></div></div>
<p>This will open a window in which you can follow the instructions to complete the installation.</p>

<h2 id="build-and-install-kernel">Build and install kernel</h2>
<blockquote>
  <p>Follow these instructions on the host machine</p>
  <h3 id="clone-the-kernel-sources">Clone the kernel sources</h3>
  <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>git clone git://kernel.ubuntu.com/ubuntu/ubuntu-xenial.git
cd ubuntu-xenial
# check out the required kernel
git checkout tags/ubuntu-hwe-4.15.0-112.113_16.04.1
</code></pre></div>  </div>
  <h3 id="configure">Configure</h3>
  <h4 id="get-the-default-config">get the default config</h4>
  <p>Copy the config from the QEMU VM; you can find it at the following path on the <strong>vm</strong>.</p>
  <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>/boot/config-`uname -r`
</code></pre></div>  </div>
  <p>Let’s say you copied the config from the VM to the host as <code class="language-plaintext highlighter-rouge">ubuntu16.04config</code>.</p>
</blockquote>
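<p>One possible way to get the config out of the VM is sketched below. It assumes the VM uses the user-mode networking from the install command above (where QEMU makes the host reachable from the guest at <code class="language-plaintext highlighter-rouge">10.0.2.2</code> by default) and that the host runs an SSH server; <code class="language-plaintext highlighter-rouge">[your-host-account]</code> is a placeholder. Any other file transfer method works just as well.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Run inside the guest VM.
# Copies the running kernel's config to the home directory of your account on the host.
scp /boot/config-$(uname -r) [your-host-account]@10.0.2.2:ubuntu16.04config
</code></pre></div></div>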

<p>Now copy the config as <code class="language-plaintext highlighter-rouge">.config</code> into the <code class="language-plaintext highlighter-rouge">ubuntu-xenial</code> folder (i.e., the folder where we checked out the kernel sources):</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>cp ubuntu16.04config &lt;path to ubuntu-xenial&gt;/.config
</code></pre></div></div>
<h4 id="modify-the-config-optional">Modify the config (Optional)</h4>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>make menuconfig
</code></pre></div></div>
<p>This will open a window where you can enable or disable additional kernel configuration options.</p>

<h3 id="building-kernel">Building kernel</h3>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>chmod a+x debian/scripts/*
chmod a+x debian/scripts/misc/*
cp debian/scripts/retpoline-extract-one scripts/ubuntu-retpoline-extract-one
make deb-pkg
</code></pre></div></div>
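<p>The build can take a long time on a single core. A hedged variant of the last command, assuming the <code class="language-plaintext highlighter-rouge">nproc</code> utility is available on the host:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Build the packages using all available CPU cores.
make -j"$(nproc)" deb-pkg
# deb-pkg places the resulting packages in the parent directory.
ls ../*.deb
</code></pre></div></div>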
<h3 id="installing-kernel-on-to-the-guest">Installing kernel on to the guest</h3>
<p>Copy all <code class="language-plaintext highlighter-rouge">*.deb</code> from host to guest and install the built kernel into the <strong>vm</strong>.</p>
<blockquote>
  <p>Run the following commands in the guest VM:</p>
  <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>sudo dpkg -i linux-image-&lt;..&gt;.deb
sudo dpkg -i linux-headers-&lt;..&gt;.deb
</code></pre></div>  </div>
</blockquote>

<h2 id="debugging-guest-vm">Debugging Guest VM</h2>
<p>First, run the QEMU vm and make qemu wait for the debugger using the following command:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>qemu-system-x86_64 -s -S -hda ubuntu16.04.qcow -device virtio-net,netdev=vmnic -netdev user,id=vmnic -m 4G -enable-kvm -append "console=ttyS0"
</code></pre></div></div>
<p>This will make QEMU wait until the debugger is attached.</p>

<p>Now, in another terminal window:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Go to the folder where we built the kernel
cd &lt;path to ubuntu-xenial&gt;
# start gdb
gdb
&gt; file vmlinux
&gt; target remote :1234
# You are now inside the debugger and should see that execution has stopped.
</code></pre></div></div>
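<p>The same session can also be started in one line. This is a sketch rather than a required step: <code class="language-plaintext highlighter-rouge">start_kernel</code> is chosen here only as an example breakpoint location. If breakpoints are never hit, one common cause is kernel address space layout randomization (KASLR); booting the guest kernel with the <code class="language-plaintext highlighter-rouge">nokaslr</code> parameter is the usual workaround.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Run from the folder where the kernel was built (the one containing vmlinux).
# Each -ex runs one gdb command in order: attach to QEMU's gdb stub on port 1234,
# set a hardware breakpoint on the kernel's C entry point, and resume the guest.
gdb vmlinux -ex 'target remote :1234' -ex 'hbreak start_kernel' -ex 'continue'
</code></pre></div></div>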
<p>That’s it! You can use the regular gdb commands from now on.</p>

<h2 id="references">References</h2>
<p>[1] https://wiki.gentoo.org/wiki/QEMU/Options
[2] https://help.ubuntu.com/community/Kernel/Compile#Alternate_Build_Method_.28B.29:_The_Old-Fashioned_Debian_Way</p>]]></content><author><name>Aravind Machiry</name><email>amachiry@purdue.edu</email></author><category term="Kernel Debugging Setup" /><category term="Ubuntu 16.04" /><summary type="html"><![CDATA[All the following steps are tested on Ubuntu 16.04]]></summary></entry><entry><title type="html">Making Kernel Drivers Great Again</title><link href="https://machiry.github.io/posts/2017/09/kernel-drivers/" rel="alternate" type="text/html" title="Making Kernel Drivers Great Again" /><published>2017-09-23T00:00:00-07:00</published><updated>2017-09-23T00:00:00-07:00</updated><id>https://machiry.github.io/posts/2017/09/making-kernel-drivers-great-again</id><content type="html" xml:base="https://machiry.github.io/posts/2017/09/kernel-drivers/"><![CDATA[<p>Project: MKDGA</p>

<p>Kernel drivers were once good. A few years ago (circa 2008), security issues in the Linux kernel were mostly in the non-driver components. Most of us thought the Linux kernel was getting better with respect to security.</p>

<p>In 2010, Android became popular. Hundreds of vendors quickly started producing Android-compliant devices. Competition between the vendors became fierce, and time to market became an important factor in capturing the growing market.
Android uses the Linux kernel as its core, and vendors write drivers to support their hardware. However, because of this time-to-market pressure, these drivers were not <em>properly</em> vetted, resulting in drivers becoming the bug-prone components of the Android kernel [1]. If you take a look at the CVEs [2], most of these bugs are embarrassing; it is incredible that such code even exists.</p>

<p>I want to solve this problem and make Linux kernel drivers great again.
My grand plan:</p>

<p>1) Develop a precise static analysis technique that can find easy bugs.</p>

<p>Before developing yet another static bug-finding tool, I wanted to check how existing tools perform on Android kernel drivers. The results were not good: a huge number of warnings, and sometimes even code as simple as the snippet below raises multiple warnings.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>char buf[100];
strcpy(buf, "Hello");
</code></pre></div></div>
<p>Although I understand that I should <em>never</em> use strcpy, the above code is fine: the six bytes of "Hello" (including the terminating null byte) easily fit in the 100-byte buffer.
We need a tool that can spot easy bugs with a low false-positive rate (&lt; 20%). By easy, I mean memory corruption vulnerabilities that can be triggered by user data. In program analysis lingo, these are called taint-based vulnerabilities.
Together with a few amazing people from UC Santa Barbara, I developed a tool called <strong>DR.CHECKER (published at USENIX Security 2017)</strong>, which tries to achieve exactly this in a completely automated way. Furthermore, it has an amazing UI, where you can see exactly how user data could cause a reported vulnerability.</p>

<p>Refer to https://github.com/ucsb-seclab/dr_checker for the usage guide.</p>

<p>2) Develop a smart fuzzer customized for the drivers.</p>

<p>While looking up existing work on fuzzing the Linux kernel, I found syzkaller [3] by Google, which truly is a masterpiece and the gold standard for fuzzing Linux kernel syscalls. However, one problem with it is that it requires a specification of the driver interface, such as the device name, the possible ioctl command IDs, and the corresponding structures.
Although this information could easily be specified by the driver developers, it is a non-trivial task for a security analyst. We developed a technique called <strong>DIFUZE (going to be published at CCS 2017)</strong>, which retrieves the driver interface in an automated way. These interfaces can be used with syzkaller (recommended) or with our simple fuzzer, MangoFuzz, to fuzz the drivers.</p>

<p>3) Develop a website where people can submit their kernel.tar.gz and get back a self-contained Docker image customized to analyze the kernel sources both statically and dynamically with a single command, run.py.</p>

<p>I registered the domain drchecker.io to integrate DR.CHECKER and DIFUZE into a self-contained docker image, for the analysts to use.</p>

<p>I will be working on this, whenever I find free time. Any additional help is greatly appreciated.
Please do not hesitate to contact me for any details.</p>

<p>References:</p>

<p>[1] https://events.linuxfoundation.org/sites/events/files/slides/Android-%20protecting%20the%20kernel.pdf</p>

<p>[2] https://source.android.com/security/bulletin/</p>

<p>[3] https://github.com/google/syzkaller</p>]]></content><author><name>Aravind Machiry</name><email>amachiry@purdue.edu</email></author><category term="Linux Kernel" /><category term="Vulnerability Detection" /><summary type="html"><![CDATA[Project: MKDGA]]></summary></entry><entry><title type="html">The need for Extensible and configurable Static Taint Tracking for C/C++</title><link href="https://machiry.github.io/posts/2017/05/static-taint-tracking/" rel="alternate" type="text/html" title="The need for Extensible and configurable Static Taint Tracking for C/C++" /><published>2017-05-31T00:00:00-07:00</published><updated>2017-05-31T00:00:00-07:00</updated><id>https://machiry.github.io/posts/2017/05/static-taint-tracking</id><content type="html" xml:base="https://machiry.github.io/posts/2017/05/static-taint-tracking/"><![CDATA[<h1 id="update">Update:</h1>
<blockquote>
  <p>There is an open-source extensible framework: https://phasar.org/</p>
</blockquote>

<p>Taint tracking, as the name implies, is a technique that tracks the “taint” of data throughout a program. The taint of a piece of data is usually a binary attribute, so it can be represented by a Boolean value (true/false or 1/0). There are other possible representations of taint, which we ignore for simplicity. Most often, taint is used to indicate whether the data is “controlled” by the user or not. Refer to [1] for a comprehensive treatment of taint tracking.</p>

<p>One of the most common use cases of taint tracking is detecting input validation vulnerabilities, i.e., checking whether tainted data can reach a program point (or sensitive function) that expects untainted data. For example, using a tainted string as the source string in a strcpy call can overflow the destination buffer.</p>

<p>Depending on the method of tracking, Taint Tracking techniques are classified as dynamic or static.</p>

<p>In dynamic taint tracking, the program is instrumented with taint propagation instructions, along with checks to make sure that tainted data does not reach sensitive functions. Dynamic taint tracking is the popular choice, and there are many tools available to perform it on binaries [3, 4], C/C++ using LLVM [5], Java [6], etc. However, dynamic taint tracking suffers from the same disadvantages as any dynamic analysis technique, such as input generation, speed, etc. Refer to [2] for more details about the disadvantages of dynamic analysis techniques.</p>

<p>In static taint tracking, by contrast, standard data-flow techniques are used to propagate taint, and warnings are raised when tainted data may reach a sensitive function. Static taint tracking is not as popular. There are only a few tools available, for Java, binaries, web applications, etc.</p>

<p>One interesting thing to note here is that there is no usable static taint tracking tool available for C/C++. A few works try to achieve this, but they are either discontinued [7, 8] or not extensible [7].
One work that comes close is by Marcelo [9], which modifies the Clang static analyzer to perform taint tracking. But Clang has disadvantages: it cannot analyze more than one source file at a time, and it does not have access to the LLVM analyses, which are helpful for doing interesting things.</p>

<p>The need of the hour is to <strong>have static taint tracking as an LLVM pass</strong>. It is sad to see that a decades-old technique is not available for the languages for which it is most applicable.</p>

<p>The lack of an extensible and configurable static taint tracking tool is an open opportunity ignored by academia. Is anyone willing to take up static taint tracking for C/C++ using LLVM as their project? I am with you and can help you in all stages of the project.</p>

<p>Good to know: The compilation flag -gsrc to clang produces a bitcode file with accurate source lines information.</p>
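<p>As a minimal sketch of producing and inspecting such a bitcode file, the commands below use clang’s standard <code class="language-plaintext highlighter-rouge">-g</code> (debug information) and <code class="language-plaintext highlighter-rouge">-emit-llvm</code> options. The file name <code class="language-plaintext highlighter-rouge">example.c</code> is a placeholder, and this is an assumption about one workable setup, not necessarily the exact invocation meant above.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Compile a C file to LLVM bitcode, keeping debug information and disabling optimizations.
clang -g -O0 -c -emit-llvm example.c -o example.bc
# Convert the bitcode into human-readable LLVM IR.
llvm-dis example.bc -o example.ll
# Count the source-location records; a non-zero count means line information is present.
grep -c DILocation example.ll
</code></pre></div></div>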

<p>Cheers.</p>

<p>[1] All You Ever Wanted to know about Dynamic Taint Tracking: https://users.ece.cmu.edu/~aavgerin/papers/Oakland10.pdf</p>

<p>[2] Table 1 of the pdf: https://link.springer.com/chapter/10.1007/978-3-319-11933-5_13</p>

<p>[3] libdft: http://www.cs.columbia.edu/~vpk/research/libdft/</p>

<p>[4] Google: “Dynamic Taint Tracking for binaries.”</p>

<p>[5] DataFlowSanitizer: http://clang.llvm.org/docs/DataFlowSanitizer.html</p>

<p>[6] Google: “dynamic taint tracking for java”</p>

<p>[7] Context sensitive static taint tracking: https://ece.uwaterloo.ca/~xnoumbis/noumbissi-thesis.pdf</p>

<p>[8] https://github.com/dceara/tanalysis/tree/master/tanalysis</p>

<p>[9] https://www.researchgate.net/publication/312938554_An_User_Configurable_Clang_Static_Analyzer_Taint_Checker</p>]]></content><author><name>Aravind Machiry</name><email>amachiry@purdue.edu</email></author><category term="Static analysis" /><category term="LLVM" /><summary type="html"><![CDATA[Update: There is an open-source extensible framework: https://phasar.org/]]></summary></entry></feed>