This is the system we are going to use as our working example. We have M clients
connected to an external network. Both are given by the customer and can't be
changed. The clients send requests to the load balancer in the internal network,
which then forwards them to one of N servers with X hardware accelerators each.
Here, N and X are system-level design parameters the system architect can play
with. They can also freely choose what the internal network looks like in terms
of topology, link speeds, etc. Further, we have component-level parameters like
the number of cores and the amount of memory available at the servers, and
architectural choices for the hardware accelerators, such as clock speed and the
dimensions of their inner compute grid. Realistically, both networks are also
going to carry background traffic, which we parameterize by its volume as a
percentage of the theoretical maximum throughput.
Even for this rather simple system, we can already ask a bunch of questions that
need evaluation for reliable answers: Given that the customer wants to have M
clients, how many servers N do we need to achieve the service-level objectives,
for example a guaranteed maximum request latency? Can we reduce the number of
servers required by introducing hardware accelerators? How is all this
influenced by background traffic? Can we reduce costs for building the internal
network by prioritizing client-server traffic over background traffic with the
help of smart network switches?
# Let's do some Evaluation with SimBricks!
To simulate this system with SimBricks, we need a simulator for each
component. You decide the level of detail you need here! For the hardware
accelerator, we can quickly write up a behavioral model in C++, which already
allows us to answer what-if questions. But most importantly, we are going to run
the actual software and workloads of our customer to measure the end-to-end
properties we care about.
For building the simulation, you write a Python script for the SimBricks
orchestration framework that describes the system you want to simulate and
which simulator to use for each component.
# Use the Full Python Machinery to Build SimBricks Experiments!
In a [prior blog post](https://www.simbricks.io/blog/orchestration_framework.html), Hejing illustrates how to easily cast a system design into an experiment in the SimBricks orchestration framework. TL;DR: You just instantiate a few classes. However, there’s no restriction here that forces you to just instantiate one experiment per Python module. Instead, we can construct one for every combination of parameters that we want to evaluate. Since this is Python and we are just instantiating classes, you can use your favorite Python constructs to do so! I decided to go for `itertools.product()` and a few simple for-loops.
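As a rough sketch of what this can look like (the `ExperimentConfig` placeholder and the parameter values are made up for illustration and are not part of the SimBricks API), we enumerate every combination and build one experiment description per combination:

```python
import itertools
from dataclasses import dataclass

# Placeholder for one concrete experiment configuration. In a real script you
# would instead instantiate the orchestration framework's classes (hosts,
# NICs, networks, accelerators) for this combination of parameters.
@dataclass
class ExperimentConfig:
    name: str
    num_servers: int        # N
    accels_per_server: int  # X
    bg_traffic_pct: int     # background traffic in % of max throughput

# Illustrative ranges for the design-space sweep.
servers = [2, 4, 8]
accels = [0, 1, 2]
bg_traffic = [0, 25, 50]

experiments = []
for n, x, bg in itertools.product(servers, accels, bg_traffic):
    experiments.append(ExperimentConfig(
        name=f'lb-n{n}-x{x}-bg{bg}',
        num_servers=n,
        accels_per_server=x,
        bg_traffic_pct=bg,
    ))
```

Every entry in `experiments` then describes one self-contained simulation that can be run and evaluated independently of the others.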
# Fast Design Space Sweeps by Running Experiments in Parallel
In SimBricks, individual experiments can be run independently and therefore in parallel for fast design space sweeps. Our orchestration framework even automates this for you if you invoke `simbricks-run` with the `--parallel` flag. Parallelizing on the same machine isn’t always possible though. Because simulators communicate through shared-memory queues that use active polling for maximum efficiency (learn more about this [here](https://www.simbricks.io/blog/shm-message-passing.html)), no simulator can share a physical thread with another, or else simulations become very slow.
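As a back-of-the-envelope illustration of this limit (the component counts below are assumptions about our example system, not anything SimBricks reports), you can estimate how many experiments fit on one machine when every simulator needs its own physical thread:

```python
import os

# Rough count of simulator processes in one experiment of our example system.
# Assumed composition: one client-side host, the load balancer, N server
# hosts, N*X accelerators, plus internal and external network simulators.
def sims_per_experiment(n_servers: int, accels_per_server: int) -> int:
    hosts = 2 + n_servers                 # client side + load balancer + servers
    accelerators = n_servers * accels_per_server
    networks = 2                          # internal + external network
    return hosts + accelerators + networks

# Every simulator actively polls its shared-memory queues, so it needs a
# dedicated physical thread; that bounds parallelism on a single machine.
threads = os.cpu_count() or 1
per_experiment = sims_per_experiment(n_servers=4, accels_per_server=2)
print(f"{threads // per_experiment} experiment(s) of this size fit on {threads} hardware threads")
```

With a handful of hosts, accelerators, and network simulators per experiment, even a large server runs out of physical threads after just a few parallel experiments.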
But even in this case and to explore even more design choices in parallel, our
orchestration framework offers distributed simulations, where experiments are
run on multiple machines. Stay tuned for more on this! Until then: