Commit ec1f98ab (Verified), authored 6 months ago by Marvin Meiers
add distributed simulations blog post
Parent: ccafe469
Pipeline #106319 passed 6 months ago
_posts/2024-09-04-distributed-simulations.md: 80 additions, 0 deletions
---
title: "Distributed Simulations Using SimBricks"
subtitle: |
  How does SimBricks help users to scale up their simulations
  by distributing them across multiple machines?
date: 2024-09-04
author: marvin
permalink: /blog/distributed-simulations.html
card_image: TODO
---

SimBricks allows users to run full-system simulations by running multiple
simulators as
[separate loosely coupled processes](loosely-coupled-simulator-processes.html).
Scaling up full-system simulations in SimBricks is usually as easy as adding
more simulated components, where each new component is simulated by an
additional simulator process. SimBricks can also decompose some simulators into
multiple instances that run as separate processes, which keeps them from
becoming a bottleneck as the simulation grows. We will explore the
decomposition of simulators further in a future blog post.

This approach naturally parallelizes the simulation and keeps the simulation
time low, but it also requires more resources, especially in the form of
physical CPU cores. Since SimBricks adapters poll
[shared memory queues](shm-message-passing.html), the simulator processes are
always busy, which means that we should not oversubscribe the available CPU
cores. The size of a full-system simulation on one host is therefore limited by
that host's resources, and scaling beyond the limits of a single host requires
distributing the processes across multiple machines. In the following, we cover
how SimBricks uses proxies that communicate over the network to distribute
simulations across multiple machines.

# Scale Up By Using Separate Proxy Processes
SimBricks uses message passing for communication between the different simulator
processes. Two simulator processes on the same host communicate through shared
memory queues. In principle, a simulation can be scaled out by partitioning its
components across multiple hosts and replacing the shared memory queues between
hosts with network communication.

However, directly implementing this in individual component simulators has two
major drawbacks. First, it increases the complexity of
[integration](integrating-simulators.html), as each simulator adapter needs to
implement an additional message transport. Second, it increases communication
overhead in component simulators, leaving fewer processor cycles for the
simulators themselves and increasing simulation time. To avoid these drawbacks,
we instead implement network communication separately in proxies.

SimBricks proxies connect to local component simulators through shared memory
queues, in the same way that two simulators would connect to each other, and
forward messages over the network to their peer proxy, which operates
symmetrically. This requires an additional processor core for the proxy on each
side, but it is fully transparent to component simulators and does not increase
their communication overhead, since the simulator adapters stay the same.

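
The forwarding role of a proxy pair can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not SimBricks code: `queue.Queue` stands in for a shared memory queue, a blocking `get()` replaces polling, `socket.socketpair()` stands in for a TCP connection between two machines, and the 4-byte length prefix and all function names are made up for the example.

```python
import queue
import socket
import struct
import threading

HEADER = struct.Struct("!I")  # assumed framing: 4-byte big-endian payload length


def recv_exact(sock, n):
    """Read exactly n bytes from sock, or return None if the peer closed early."""
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            return None
        buf.extend(chunk)
    return bytes(buf)


def forward_to_peer(local_q, sock):
    """Sending proxy: take messages from the local queue and send them to the peer."""
    while True:
        msg = local_q.get()
        if msg is None:  # sentinel: the local simulator has finished
            sock.shutdown(socket.SHUT_WR)
            return
        sock.sendall(HEADER.pack(len(msg)) + msg)


def deliver_from_peer(sock, local_q):
    """Receiving proxy: read framed messages and enqueue them for the local simulator."""
    while True:
        header = recv_exact(sock, HEADER.size)
        if header is None:
            local_q.put(None)
            return
        (length,) = HEADER.unpack(header)
        payload = recv_exact(sock, length)
        if payload is None:  # connection closed in the middle of a message
            local_q.put(None)
            return
        local_q.put(payload)


def demo():
    """Send two messages from 'simulator A' to 'simulator B' through a proxy pair."""
    sock_a, sock_b = socket.socketpair()  # stand-in for a network connection
    sim_a_out, sim_b_in = queue.Queue(), queue.Queue()
    sender = threading.Thread(target=forward_to_peer, args=(sim_a_out, sock_a))
    receiver = threading.Thread(target=deliver_from_peer, args=(sock_b, sim_b_in))
    sender.start()
    receiver.start()
    for msg in (b"pcie-read", b"pcie-write"):
        sim_a_out.put(msg)
    sim_a_out.put(None)
    sender.join()
    receiver.join()
    received = []
    while (msg := sim_b_in.get()) is not None:
        received.append(msg)
    sock_a.close()
    sock_b.close()
    return received
```

Neither "simulator" in `demo()` knows that a socket is involved: both only see a local queue, which is the transparency property described above.
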
At the moment, SimBricks provides two proxy implementations, one for each
supported network protocol: TCP and RDMA. Additional proxies can be added to
support further communication protocols.

SimBricks proxies also implement multiplexing, so that the same pair of proxies
can handle multiple connections between component simulators on two machines.
This reduces the number of proxies needed and therefore leaves more CPU cores
for simulators.

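
One common way to multiplex several connections over a single stream is to tag every message with a channel identifier. The sketch below shows that idea only. The frame layout (2-byte channel id, 4-byte length) and the function names are assumptions for illustration and do not describe the wire format SimBricks proxies actually use.

```python
import struct

# Assumed frame header: channel id (2 bytes) and payload length (4 bytes).
FRAME = struct.Struct("!HI")


def mux(messages):
    """Encode (channel_id, payload) pairs into one byte stream."""
    return b"".join(FRAME.pack(ch, len(payload)) + payload
                    for ch, payload in messages)


def demux(stream):
    """Split a byte stream back into a dict of per-channel message lists."""
    channels = {}
    offset = 0
    while offset < len(stream):
        ch, length = FRAME.unpack_from(stream, offset)
        offset += FRAME.size
        channels.setdefault(ch, []).append(stream[offset:offset + length])
        offset += length
    return channels


# Two simulator connections (channels 0 and 1) share one proxy connection.
stream = mux([(0, b"nic->switch"), (1, b"host->nic"), (0, b"nic->switch 2")])
per_channel = demux(stream)
```

Message order is preserved within each channel, so each simulator connection still sees its own messages in the order they were sent.
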
# Orchestrating Proxies
SimBricks' [orchestration framework](orchestration_framework.html) supports
proxies and can distribute full-system simulations across multiple machines.
The user creates a distributed experiment and adds simulation components just
as with a normal non-distributed simulation. Then, the user adds the
appropriate proxies to the experiment and finally assigns the simulation
components to the different machines. When starting the simulation, the user
provides a JSON file containing information about the available machines, such
as the IP address and the working directory. The orchestration framework then
takes care of running all simulators and proxies on the respective machines,
using SSH to execute processes on remote machines.

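
To make the idea of a machine description plus SSH execution concrete, here is a minimal sketch. The JSON field names (`name`, `ip`, `workdir`), the addresses, and the `./run-simulator.sh` command are all hypothetical. The actual file format and launch logic are defined by the SimBricks orchestration framework and may look different.

```python
import json
import shlex

# Hypothetical machine description; the field names are assumptions,
# not the format the SimBricks orchestration framework expects.
MACHINES_JSON = """
[
  {"name": "host0", "ip": "10.0.0.1", "workdir": "/tmp/simbricks-work"},
  {"name": "host1", "ip": "10.0.0.2", "workdir": "/tmp/simbricks-work"}
]
"""


def ssh_command(machine, command):
    """Build an argv list that runs `command` in the machine's workdir via ssh."""
    remote = f"cd {shlex.quote(machine['workdir'])} && {command}"
    return ["ssh", machine["ip"], remote]


machines = {m["name"]: m for m in json.loads(MACHINES_JSON)}

# ./run-simulator.sh is a placeholder for whatever process should start remotely.
argv = ssh_command(machines["host1"], "./run-simulator.sh")
print(argv)
```

The resulting list could be passed to `subprocess.run`. The sketch only builds the command so that it runs without any remote machine.
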
The orchestration framework also includes an example that automatically
distributes an experiment across two hosts, showing that this step can be
automated as well.

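
As a rough illustration of what automatic distribution involves, the sketch below places components on hosts with a simple fill-first heuristic and reports which links cross hosts and therefore need proxies. It is not the example shipped with the orchestration framework. The heuristic, the component names, and the choice to always reserve one core per host for a proxy are assumptions.

```python
def distribute(components, links, hosts, cores_per_host):
    """Place one component per core, keeping one core per host free for a proxy.

    Returns the placement and the list of links that cross host boundaries.
    """
    budget = {host: cores_per_host - 1 for host in hosts}
    placement = {}
    for comp in components:
        # Fill-first: pick the first host that still has a free core.
        host = next((h for h in hosts if budget[h] > 0), None)
        if host is None:
            raise ValueError(f"not enough cores to place {comp}")
        placement[comp] = host
        budget[host] -= 1
    crossing = [(a, b) for a, b in links if placement[a] != placement[b]]
    return placement, crossing


components = ["client_host", "client_nic", "switch", "server_nic", "server_host"]
links = [("client_host", "client_nic"), ("client_nic", "switch"),
         ("switch", "server_nic"), ("server_nic", "server_host")]

placement, crossing = distribute(components, links, ["host0", "host1"], 4)
# Only the switch <-> server_nic link crosses hosts and needs a proxy pair.
```

A placement that minimizes the number of crossing links keeps most communication on shared memory queues, which is why the order of `components` matters for this heuristic.
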
If you have questions or would like to learn more: