Basic Profiling
You can use basic profiling to determine where the time is spent in evaluating the model equations, evaluating the steps in the simulation.
To be able to use any profiling, precision timing is needed. This can be accomplished setting the flag:
Advanced.Define.PrecisionTiming=true
This functionality uses the frequency counter of the processor to provide high-resolution time measurements.
To turn on basic profiling, set the flag:
Advanced.GenerateBlockTimers=true
Then translate the model, and run the simulation. At the end of the log file a table contains the profiling information.
We exemplify the usage of block timers by simulating the model Modelica.Mechanics.MultiBody.Examples.Loops.EngineV6_analytic (which takes about 0.1 second to simulate). Below follows a timing of such a simulation.
Profiling information for the blocks. Estimated overhead per call 0.01[us] total 0.001[s] the estimated overhead has been subtracted below. Name of block , Block, Total CPU[s], Mean[us] ( Min[us] to Max[us] ), Called Entire model : 0, 0.139, 10.88 ( 0.53 to 464.91), 12740 Event handling : 1, 0.003, 13.59 ( 11.18 to 49.33), 252 Empty timer : 2, 0.000, 0.01 ( 0.00 to 0.05), 12744 Outside of model : 3, 0.034, 2.68 ( 0.07 to 2140.39), 12742 Parameters : 4, 0.000, 203.73 ( 203.73 to 203.73), 1 Parameter code : 5, 0.000, 197.46 ( 197.46 to 197.46), 1 Parameter code : 6, 0.000, 6.12 ( 6.12 to 6.12), 1 InitialSection : 7, 0.000, 292.15 ( 292.15 to 292.15), 1 Initial code : 8, 0.000, 232.99 ( 232.99 to 232.99), 1 initialization.linear[1]: 9, 0.000, 57.24 ( 57.24 to 57.24), 1 Initial code : 10, 0.000, 0.07 ( 0.07 to 0.07), 1 OutputSection : 11, 0.123, 10.26 ( 8.29 to 463.70), 11990 Output code : 12, 0.044, 3.67 ( 2.92 to 265.53), 11990 simulation.linear[1] : 13, 0.078, 6.46 ( 5.07 to 458.66), 11990 Output code : 14, 0.000, 0.02 ( 0.01 to 0.73), 11990 DynamicsSection : 15, 0.001, 0.06 ( 0.03 to 16.61), 11991 Dynamics code : 16, 0.000, 0.01 ( 0.00 to 0.12), 11991 AcceptedSection2 : 17, 0.007, 1.62 ( 1.21 to 28.02), 4337 Auxiliary2 code : 18, 0.007, 1.58 ( 1.15 to 27.94), 4337
The first lines state that we have estimated the overhead of the timers to 0.01 microseconds and subtracted them from all timing estimates, thus making it easier to find the problematic block. The total overhead is also included for comparison.
It is then necessary to examine the file dsmodel.c in order to determine what block number corresponds to what variable(s). The start of block 6 is marked by DymolaStartTimer(6) and the end by DymolaEndTimerName(6). The first blocks are special:
Block # | Task |
---|---|
0 | The total time for all model evaluations. |
1 | The total time for model evaluations during event handling. |
2 | An empty block, included just for comparison. |
3 | The total time spent between model evaluations. |
For each of these blocks we have only subtracted the overhead for its own timer and not for additional timer calls that occur during its call. Thus the total time for all model evaluations should more accurately be around 0.139 s - 0.001s (total overhead) = 0.138 s.
In this example the event iterations are insignificant compared to the rest, even though there are 125 event iterations.
The remaining blocks are code-blocks, either system of equations (including a trailing part of torn equations) or blocks of non-equations.
Among these blocks the most time is spent in block 13. This linear system of equations accounts for about 0.08 s of the models total 0.14 s. It represents the global kinematic loop that is defined by the six pistons through their connection to the crankshaft. However, note that each cylinder in the EngineV6_analytic model contains a loop breaker joint that analytically solves all kinematic loops. Thus, simulation.linear[1] is entirely solved during translation. During simulation it only consists of assignment statements (as can be seen under Statistics in the Translation log, where the equation´s size is listed as zero).
Note that block 8 has the largest average time of the normal blocks, but it has no significant influence on the total CPU-time. This block consists of variables that are parameter-dependent and thus are constant during continuous integration, and is thus evaluated only once.
Remember to reset the flag:
Advanced.GenerateBlockTimers=false
in order to avoid the overhead for the next model. This flag is not cleared by Clear or Clear All commands.
The timer overhead also affects the total simulation time. Also note that the accurate timers measure wall-clock time and not CPU-time in the simulation process. Thus simple statements can have large maximum times if another process interrupts them between the start of the timer and the stop of the timer. This should not affect the average and minimum times.