Muli Ben-Yehuda's journal

March 2, 2011

vIOMMU: Efficient IOMMU Emulation

Filed under: Uncategorized — Muli Ben-Yehuda @ 4:01 PM

My colleague Nadav Amit will be presenting his M.Sc. research, which I had the pleasure of helping with, this upcoming Sunday. The summer before last, Nadav did an internship with my group at the Haifa Research Lab. Nadav’s internship was dedicated to analyzing the IOTLB behavior of contemporary IOMMUs, and resulted in this WIOSCA paper. In order to analyze IOTLB behavior, we first had to collect traces of how modern operating systems set up their DMA buffers, and to do that, Nadav developed IOMMU emulation in KVM.

For his M.Sc., Nadav researched how to emulate IOMMUs efficiently, leading to two primary contributions. First, waiting just a few milliseconds before tearing down an IOMMU mapping can boost performance substantially, due to high temporal reuse. Second, it is possible to emulate a hardware device without trapping to the hypervisor on every device interaction, by using a separate core (a sidecore) to run the device emulation code. The full abstract is below, and everyone is invited to the talk.

Direct device assignment, where a guest virtual machine directly interacts with an I/O device without host intervention, is appealing, because it allows an unmodified (non-hypervisor-aware) guest to achieve near-native performance. But device assignment for unmodified guests suffers from two serious deficiencies: (1) it requires pinning of all the guest’s pages, thereby disallowing memory overcommitment, and (2) it exposes the guest’s memory to buggy device drivers.

We solve these problems by designing, implementing, and exposing an emulated IOMMU (vIOMMU) to the unmodified guest. We employ two novel optimizations to make vIOMMU perform well: (1) waiting a few milliseconds before tearing down an IOMMU mapping in the hope it will be immediately reused (“optimistic teardown”), and (2) running the vIOMMU on a sidecore, and thereby enabling for the first time the use of a sidecore by unmodified guests. Both optimizations are highly effective in isolation. The former allows bare-metal to achieve 100% of a 10Gbps line rate. The combination of the two allows an unmodified guest to do the same.
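To make the “optimistic teardown” idea concrete, here is a minimal toy sketch in Python. It is not the vIOMMU code: the class `DeferredTeardownCache`, the `replay` helper, the trace format, and the abstract time units are all assumptions made up for illustration. It only counts how many expensive mapping setups and teardowns a trace would need, with and without a grace period.

```python
class DeferredTeardownCache:
    """Toy model of a mapping table that delays teardown (hypothetical)."""

    def __init__(self, grace_period):
        self.grace_period = grace_period  # how long an unmapped entry lingers
        self.active = set()               # mappings currently in use
        self.pending = {}                 # addr -> time the unmap was requested
        self.setups = 0                   # count of (expensive) real setups
        self.teardowns = 0                # count of (expensive) real teardowns

    def _expire(self, now):
        # Really tear down entries whose grace period has elapsed.
        expired = [addr for addr, t in self.pending.items()
                   if now - t >= self.grace_period]
        for addr in expired:
            del self.pending[addr]
            self.teardowns += 1

    def map(self, addr, now):
        self._expire(now)
        if addr in self.pending:
            # Reuse: the old mapping was never torn down, so no setup is needed.
            del self.pending[addr]
        else:
            self.setups += 1
        self.active.add(addr)

    def unmap(self, addr, now):
        self.active.discard(addr)
        self.pending[addr] = now          # defer the actual teardown
        self._expire(now)                 # with grace_period == 0 this is immediate


def replay(trace, grace_period):
    """Replay (time, op, addr) tuples; return (setups, teardowns)."""
    cache = DeferredTeardownCache(grace_period)
    for now, op, addr in trace:
        if op == "map":
            cache.map(addr, now)
        else:
            cache.unmap(addr, now)
    return cache.setups, cache.teardowns


# Hypothetical trace: one DMA buffer that is mapped and unmapped repeatedly.
EXAMPLE_TRACE = [
    (0, "map", 0x1000), (1, "unmap", 0x1000),
    (2, "map", 0x1000), (3, "unmap", 0x1000),
    (4, "map", 0x1000), (5, "unmap", 0x1000),
]
```

On `EXAMPLE_TRACE`, `replay(EXAMPLE_TRACE, 0)` performs three setups and three teardowns, while `replay(EXAMPLE_TRACE, 5)` performs a single setup and no teardown within the trace. The sketch ignores the trade-off that a lingering mapping remains accessible to the device until it is really torn down.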


February 3, 2011

new paper accepted

Filed under: Uncategorized — Muli Ben-Yehuda @ 5:40 PM

Our paper “Ginkgo, Automated, Application-Driven Memory Overcommitment for Cloud Computing” has been accepted to the ASPLOS RESoLVE workshop. Here is the abstract:

Continuous advances in multicore and I/O technologies have caused memory to become a very valuable sharable resource that limits the number of virtual machines (VMs) that can be hosted in a single physical server. While today’s hypervisors implement a wide range of mechanisms to overcommit memory, they lack memory allocation policies and frameworks capable of guaranteeing levels of quality of service to their applications.

In this short paper we introduce Ginkgo, a memory overcommit framework that takes an application-aware approach to the problem. Ginkgo dynamically estimates VM memory requirements for applications without user involvement or application changes. Ginkgo regularly monitors application progress and incoming load for each VM, using this data to predict application performance under different VM memory sizes. It automates the distribution of memory across VMs during runtime to satisfy performance and capacity constraints while optimizing towards one of several possible goals, such as maximizing overall system performance, minimizing application quality-of-service violations, minimizing memory consumption, or maximizing profit for the cloud provider.

Using this framework to run the benchmarks DayTrader 2.0 and SPECweb2009, our initial experimental results indicate that overcommit ratios of at least 2x can be achieved while maintaining application performance, independently of additional memory savings that can be enabled by techniques such as page coalescing.
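As a rough illustration of the allocation step the abstract describes, here is a minimal Python sketch. It is not Ginkgo's algorithm: the function `distribute_memory`, its parameters, and the toy performance predictors are hypothetical. It assumes each VM comes with a function predicting its performance at a given memory size, and it greedily hands out memory in fixed-size steps to whichever VM is predicted to gain the most, which is only a reasonable strategy when the predictors have diminishing returns.

```python
def distribute_memory(total_mb, step_mb, min_mb, predictors):
    """Greedy sketch: give each chunk of memory to the VM that benefits most.

    total_mb   -- memory available for all VMs (hypothetical parameter)
    step_mb    -- allocation granularity
    min_mb     -- dict: VM name -> minimum allocation
    predictors -- dict: VM name -> function(memory_mb) -> predicted performance
    """
    alloc = {vm: min_mb[vm] for vm in predictors}
    remaining = total_mb - sum(alloc.values())
    if remaining < 0:
        raise ValueError("minimum allocations exceed available memory")

    while remaining >= step_mb:
        best_vm, best_gain = None, 0.0
        for vm, predict in predictors.items():
            # Predicted improvement if this VM received one more step.
            gain = predict(alloc[vm] + step_mb) - predict(alloc[vm])
            if gain > best_gain:
                best_vm, best_gain = vm, gain
        if best_vm is None:
            break  # no VM is predicted to benefit from more memory
        alloc[best_vm] += step_mb
        remaining -= step_mb
    return alloc


# Toy predictors (assumptions): performance grows linearly, then saturates.
EXAMPLE_PREDICTORS = {
    "web": lambda mb: min(mb, 1024) / 1024,
    "db": lambda mb: min(mb, 2048) / 2048,
}
EXAMPLE_MIN = {"web": 256, "db": 256}
```

With these toy predictors, `distribute_memory(2048, 256, EXAMPLE_MIN, EXAMPLE_PREDICTORS)` gives each VM 1024 MB: the "web" VM is served first because each step helps it more, and the remainder goes to "db" once "web" saturates. Swapping the objective (for example, weighting gains by revenue) would only change how `gain` is computed.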

I will post the final version of the paper on the publications page when it is ready.

First post!

Filed under: Uncategorized — Muli Ben-Yehuda @ 4:46 PM

Well, it was bound to happen sometime. I have a new blog.

October 5, 2010

Filed under: Uncategorized — Muli Ben-Yehuda @ 5:14 PM

You know it’s a good day when you get to use angry turtle in a presentation. Angry turtle is angry!


October 4, 2010

Filed under: Uncategorized — Muli Ben-Yehuda @ 8:57 AM

I was hurrying down the Newark airport terminal, wondering whether I
was going to make the connecting flight to Seattle, en route to
Vancouver for the 9th USENIX Symposium on Operating Systems Design and
Implementation. Suddenly, my cell phone rang. It was Michael, a
long-time co-worker and mentor. “Have you seen the
email?” “No, I just landed in Newark and am on the way to catch a
connection to Seattle. Which email?” “Here, let me read you the

Dear Authors,

Your paper has been selected as one of two winners of the OSDI Jay
Lepreau Best Paper award.”

Receiving this award is a unique experience and a great honor. It is
doubly sweet because of all the research projects I’ve worked on, the
Turtles nested virtualization project is perhaps the one I am most
proud of. When Orit, Ben, and I started working on it in 2008, we set
out to do the impossible. Many colleagues claimed that efficient
nested x86 virtualization on the Intel platform could not be
done. Eventually, working long and hard, and with help from friends,
we showed that not only could it be done, it even performs well. I’ve
learned a lot in the process, about x86 virtualization, about leading
a team, and about the art and craft of doing research, but the most
important lesson was to never lose hope, to always believe that
eventually, it will work. And guess what? It did!

If you want to know how we did it, and what we learned in the process,
check out The Turtles Project: Design and Implementation of Nested
Virtualization.

In classical machine virtualization, a hypervisor runs multiple
operating systems simultaneously, each on its own virtual machine. In
nested virtualization a hypervisor can run multiple other
hypervisors with their associated virtual machines. As operating
systems gain hypervisor functionality—Microsoft Windows 7 already
runs Windows XP in a virtual machine—nested virtualization will
become necessary in hypervisors that wish to host them. We present the
design, implementation, analysis, and evaluation of high-performance
nested virtualization on Intel x86-based systems. The Turtles project,
which is part of the Linux/KVM hypervisor, runs multiple
unmodified hypervisors (e.g., KVM and VMware) and operating
systems (e.g., Linux and Windows). Despite the lack of architectural
support for nested virtualization in the x86 architecture, it can
achieve performance that is within 6-8% of single-level (non-nested)
virtualization for common workloads, through multi-dimensional
paging for MMU virtualization and multi-level device assignment
for I/O virtualization.

The scientist gave a superior smile before replying, “What
is the tortoise standing on?” “You’re very clever, young man, very
clever”, said the old lady. “But it’s turtles
all the way down!”

September 6, 2010

Happy one month birthday, Ze’ev

Filed under: Uncategorized — Muli Ben-Yehuda @ 1:17 PM


recent activity in a capsule

Filed under: Uncategorized — Muli Ben-Yehuda @ 1:15 PM

November 17, 2009

interesting call for papers

Filed under: Uncategorized — Muli Ben-Yehuda @ 9:38 PM

I have been remiss in updating this thing recently. In penance, I
offer you these interesting calls for papers from conferences that you
should, without a doubt, submit your best papers to:

The 2nd Workshop on I/O Virtualization, which I will be co-chairing,
will be co-located with ASPLOS 2010 and VEE 2010 in Pittsburgh,
Pennsylvania, in March 2010. Once again we will be looking for
ground-breaking and thought-provoking papers in I/O virtualization,
although if your paper is only ground-breaking or only
thought-provoking, that’s fine too.

The 24th International Conference on Supercomputing (ICS’10) will be
held in Japan (Japan!) in June 2010. We are soliciting papers on all
aspects of research, development, and application of high-performance
experimental and commercial systems. This will be my first time on the
ICS PC, and I am looking forward to the experience.

Last but certainly not least, SYSTOR 2010, the 3rd Annual Haifa
Experimental Systems Conference, will be held once again in Haifa in
May 2010, and you should all come.

More later.

April 7, 2009

SYSTOR 2009 Call for Participation

Filed under: Uncategorized — Muli Ben-Yehuda @ 4:01 PM
                   CALL FOR PARTICIPATION

    SYSTOR 2009---The Israeli Experimental Systems Conference
                        4-6 May 2009
                        Haifa, Israel

Registration deadline: May 2nd

SYSTOR 2009, the Israeli Experimental Systems Conference, will be held
at IBM Haifa Labs, in Haifa, Israel. The conference program will run
over three days, combining the forefront of academic systems research
with real-world systems developed in industry. The goal of the
conference is to promote systems research and to foster stronger ties
between the Israeli and worldwide systems research communities and
industry. Conference proceedings will be published by ACM in the ACM
Digital Library.

There is a limited number of seats, available on a
first-come-first-served basis upon registration
(registration is free of charge). Lunch and refreshments will be
served on all three days courtesy of IBM Haifa Labs.

The first day of the conference will feature sessions on distributed
systems, concurrency, and power management. Marc Snir, University of
Illinois at Urbana-Champaign, will give a keynote talk, and in the
afternoon a student poster session with sweet refreshments will be
held.
The second day will begin with the keynote "Towards Invisible Storage"
by Alain Azagury, Director, XIV Business Executive, IBM, and an
invited talk on "The Next Generation Data Center" by Michael Kagan,
Mellanox CTO. After the morning talks, there will be paper sessions
focusing on data de-duplication and storage issues. The day will end
with an optional social event in Caesarea.

The third day will conclude the conference with paper sessions on
virtualization and system optimizations, and a panel of well-known
systems researchers who will debate "What is Systems Research about
and is it Relevant?" The full program for all three days is available
on the conference website.

We look forward to seeing you at SYSTOR 2009!

SYSTOR Advisory Committee
    * Marc Auslander, IBM
    * Ken Birman, Cornell
    * Danny Dolev, HUJI
    * Julian Satran, IBM
    * Marc Snir, UIUC
    * Willy Zwaenepoel, EPFL

Program Chairs
    * Michael Factor, IBM
    * Dror Feitelson, HUJI

General Chair
    * Miriam Allalouf, IBM

Publicity Chair
    * Muli Ben-Yehuda, IBM

Publication Chair
    * Gregory Chockler, IBM

April 5, 2009


Filed under: Uncategorized — Muli Ben-Yehuda @ 12:11 AM

I want to update this thing more often, but there’s so much going on, the days filled with action and counter-action, that before I know it it’s past midnight, and I have to wake up at 5 AM for a workout, and updating the blog is left on the TODO list for yet another day. Like, today.

So, content?

I’ve been a manager for a month and change now, managing the virtualization and systems architecture group at the lab. It’s an interesting challenge (which is why I agreed to do it), often frustrating, occasionally exhilarating. To my surprise, the part I like most is dealing with human beings in their myriad forms. To my non-surprise, the part I like least is the bureaucracy, but I figured I’d wait a couple more months before I start tilting at windmills. I still write code (well, debug code, mostly) and conduct research, but it’s no longer the most important part of my day.

On the research front, we had two papers accepted to ICAC 2009 (one full paper and one short paper/poster), both in the general area of treating virtual machines as black boxes and inferring useful things about them (performance bottlenecks and boot time) via statistical analysis of their inputs and outputs. Another paper, on the DMA mapping problem in direct assignment, was not accepted to USENIX ATC, to my disappointment, and we are now revising it while looking for a new home.

I am continuing to work out twice a week with a private trainer who is seriously kicking my butt. It’s rare when I don’t finish a workout on the brink of exhaustion, drenched in sweat. I *love* it. Twice a week is no longer enough—I crave the endorphin rushes and sore muscles—so I’ve also re-started going for long walks, and hitting the punching bag in the back-yard like I really mean it. The kilograms are coming off, too, an added bonus.

Last but not least, SYSTOR 2009 is coming up next month, with a great program combining academic research and real-world systems. See y’all there!
