speedups output files Re: [FLASH-USERS] Details of speedup after restart
Nathan Hearn
nhearn at uchicago.edu
Wed Jun 6 04:37:07 CDT 2007
Hi Sanjib,
From the main email thread, it looks like you are on your way to a
solution to the speedup issue. (I like Artur's explanation, as
blocks are probably redistributed only during AMR refine/derefine
steps. However, it is still unclear to me why you only occasionally
see the speedup during restart.)
Regarding the nature of the restarts, I believe that flash.dat
merely stores the output stream for diagnostic data generated during
the run. Unless there are some specific input files required by the
Flash modules in use, the only files needed for restart are flash.par
and the checkpoint file. The checkpoint file contains all the
information necessary to reconstruct the mesh.
Changing the resolution during a restart is a somewhat complicated
issue. By design, Flash uses the structural information stored in the
checkpoint file to build the mesh in memory, and there is no
re-meshing capability included. However, it should be possible to
alter the lrefine settings in flash.par to force Flash to change the
minimum and maximum levels of refinement after the checkpoint data is
loaded. (This has been a recent topic of discussion here.)
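As a rough illustration of that flash.par edit, here is a minimal Python sketch that rewrites "name = value" lines in the text of a parameter file. The helper name set_par_values and the sample values are hypothetical, and the sketch assumes the parameters in question are lrefine_min and lrefine_max, written one per line with "#" starting a comment. It only edits text; whether Flash honors the new limits after loading a checkpoint is not something the script can verify.

```python
import re

def set_par_values(par_text, updates):
    """Return par_text with each parameter in `updates` set to its new value.

    Existing 'name = value' lines are rewritten in place; parameters that
    are not present (or only appear in '#' comments) are appended at the end.
    Names are matched exactly as given (an assumption of this sketch).
    """
    remaining = dict(updates)
    out = []
    for line in par_text.splitlines():
        # Match an uncommented assignment such as 'lrefine_max = 4'.
        m = re.match(r"\s*([A-Za-z_]\w*)\s*=", line)
        if m and m.group(1) in remaining:
            name = m.group(1)
            out.append(f"{name} = {remaining.pop(name)}")
        else:
            out.append(line)
    # Append any parameters that were not already set in the file.
    for name, value in remaining.items():
        out.append(f"{name} = {value}")
    return "\n".join(out) + "\n"

# Hypothetical flash.par contents, for illustration only.
sample = "restart = .true.\nlrefine_min = 1\n# lrefine_max = 4\n"
updated = set_par_values(sample, {"lrefine_min": 2, "lrefine_max": 6})
print(updated)
```

Keeping a copy of the original flash.par alongside the edited one makes it easy to go back if the restarted run does not behave as expected.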
Right now, more significant changes to the mesh -- such as
altering the physical size of the domain, changing the arrangement of
base blocks, or changing the number of zones per block -- are not
permitted. (I have been working on routines for resampling Flash data
files during the init_block stage, but they are still in development.)
- Nathan
On 6/5/07, sanjib gupta <guptasanjib at lanl.gov> wrote:
> Hi Nathan,
>
> The output files - the HDF5 check and plot files are the same size as
> before restarts.....
> and seem fine - they pickup exactly where the run left off, whether it is
> the thermodynamic conditions or mass fractions, and the resulting
> burning looks perfectly reasonable...
> I am not sure what you mean by binary-equivalent.....
>
> background operations are usually carried out on a different queue on
> the cluster......the cluster top command
> btop, just lets me know which nodes are free and which I am using....
> and it shows usage at 99 % or so, meaning I have all
> CPU usage on the dual-processor nodes......this however could change
> during maintenance hours like 1-3 am when I am not around to
> monitor usage......sometimes the number of total timesteps at the end of
> the night does not make sense (too few), but this has only been a couple
> of times....
> and I have not combed the usually voluminous logfiles unless something
> really goes wrong.
>
> However, I don't think background processes are responsible for the slow
> first run - it is too consistent, and we get maintenance messages from
> the Cluster operators.....
> I would be aware of them.
> Also the CPU usage - initially when this happened I remember doing a lot
> of checks on the allotted processors, I was getting near 100%, before
> and after the restart.
>
> The restart is from "flash.dat" ? How does it work? I was curious
> anyway, since it would be nice to do things like.....change the
> resolution at a restart. If only the thermo conditions etc. are
> transported to the
> restarted run, and properly rezoned, one would not have to worry? Unless
> of course dynamic structures have developed based on the earlier zoning
> ...but for relatively quiescent scenarios that are being restarted this
> should approximately work?
>
> Thanks
> Sanjib
--
Nathan C. Hearn
nhearn at uchicago.edu
ASC Flash Center
Computational Physics Group
University of Chicago