From rge21 at astro.su.se Wed Feb 7 16:07:23 2007
From: rge21 at astro.su.se (Richard Edgar)
Date: Wed, 7 Feb 2007 23:07:23 +0100
Subject: [FLASH-USERS] Crashy restarts
Message-ID: <20070207220722.GA10060@astro.su.se>

Greetings all,

I am trying to restart some of my Flash (v2.5 using the Intel 9.0
compiler) runs, and I'm encountering unexplained crashes. The restart runs
crash before they reach the same evolution time as the run from which they
restarted. e.g.:

Run a calculation for 1000 timesteps. As well as the initial and final
checkpoints, there's one after 500 timesteps (say)

Try to restart the code

Code dies after 30-40 timesteps

This happens whether I restart after the initial checkpoint, or the n=500
one. Are checkpoints supposed to give exactly the same result (up to the
last few bits, perhaps), or is there a difference between a restarted run
and one just left going? Looking at the code, calculation of the timestep
might be slightly affected, but I don't see that that should be
significant.

TIA,

Richard Edgar

--
Richard Edgar
http://www.astro.su.se/~rge21/
Stockholm Observatory

From dachrist at uiuc.edu Thu Feb 8 20:04:07 2007
From: dachrist at uiuc.edu (Duncan Christie)
Date: Thu, 8 Feb 2007 20:04:07 -0600
Subject: [FLASH-USERS] MPI Question
Message-ID: <39c6dfc581d740ae733927c475b7cfcb@uiuc.edu>

Hi,

I have been trying to get FLASH2.5 to run on a multi-node/multi-processor
cluster. To start, I tried running one of the test problems (sedov-2d)
just to make sure everything worked, and it worked fine. When I tried to
get my code working (MHD w/ self gravity in 3d), FLASH only seemed to use
1 processor. The log file reports that 16 processors are being used;
however, it doesn't seem to be working.

The reasons I believe it is using only one processor:

1. I had it do a simple output for each processor (a simple "Hi, my
   processor id is..." in init_block.F90). I only get output from the
   first node.

2. If I run with more than MAXBLOCKS blocks, I get an error saying it
   cannot add another block. If I understand it correctly, MAXBLOCKS is
   the maximum number of blocks for a single processor.

3. It isn't running any faster than on a single processor :-)

I don't know what could have caused this. The only files I modified are
init_block.F90 and user_bnd.F90, neither of which seems to deal with MPI.

Does anybody have a suggestion what could be going wrong?

Duncan

From dubey at flash.uchicago.edu Fri Feb 9 07:25:54 2007
From: dubey at flash.uchicago.edu (Anshu Dubey)
Date: Fri, 9 Feb 2007 07:25:54 -0600 (CST)
Subject: [FLASH-USERS] Crashy restarts
In-Reply-To: <20070207220722.GA10060@astro.su.se>
References: <20070207220722.GA10060@astro.su.se>
Message-ID: <61207.75.3.87.135.1171027554.squirrel@flash.uchicago.edu>

> Greetings all,
>
> I am trying to restart some of my Flash (v2.5 using the Intel 9.0
> compiler) runs, and I'm encountering unexplained crashes. The restart runs
> crash before they reach the same evolution time as the run from which they
> restarted. e.g.:
>
> Run a calculation for 1000 timesteps. As well as the initial and final
> checkpoints, there's one after 500 timesteps (say)
>
> Try to restart the code
>
> Code dies after 30-40 timesteps
>
> This happens whether I restart after the initial checkpoint, or the n=500
> one. Are checkpoints supposed to give exactly the same result (up to the
> last few bits, perhaps), or is there a difference between a restarted run
> and one just left going? Looking at the code, calculation of the timestep
> might be slightly affected, but I don't see that that should be
> significant.

If you are using the code with conserved_var=.false. (the default mode),
the checkpoints should give exactly the same result.
If conserved_var=.true., there is an extra division and multiplication
introduced by the conversion of the variables, and that causes the numbers
to change a little bit, which, if your problem is very non-linear, can grow
with time. But none of that explains the death on restart. You could try
dumping a few checkpoints in between so that you get at least one from the
restarted run to compare against a "from scratch" run.

Anshu

From s0675710 at sms.ed.ac.uk Thu Feb 22 15:44:14 2007
From: s0675710 at sms.ed.ac.uk (CS Daley)
Date: Thu, 22 Feb 2007 21:44:14 +0000
Subject: [FLASH-USERS] Documentation question
Message-ID: <20070222214414.ij8cz4yqhwsko0o8@www.sms.ed.ac.uk>

Dear Flash users,

My name is Chris Daley and I am an MSc student at the University of
Edinburgh.

I am investigating ways to optimise the run time of two subroutines in
the FLASH code for my MSc dissertation.

The subroutines that we wish to optimise are MapParticlesToMesh.F90
and ReDistributeParticles.F90. We are using version 2.5 of the FLASH
source code.

We have found useful information in section 13 of the user-guide about
the Particles module. However, I am wondering if anyone can provide me
with additional information about MapParticlesToMesh.F90 and
ReDistributeParticles.F90? Is there any documentation available which
provides a description of these two subroutines?

I am also very interested in how these subroutines have changed in
version 3.0 of the FLASH code. The FLASH 3.0 alpha release contains a
skeleton subroutine named Grid_moveParticles.F90, which seems to
perform the same task as ReDistributeParticles.F90 in version 2.5. Is
there any documentation or code information describing
Grid_moveParticles.F90?

I am very grateful for any feedback.
Kind Regards,
Chris

From dubey at flash.uchicago.edu Thu Feb 22 16:05:55 2007
From: dubey at flash.uchicago.edu (Anshu Dubey)
Date: Thu, 22 Feb 2007 16:05:55 -0600 (CST)
Subject: [FLASH-USERS] Documentation question
In-Reply-To: <20070222214414.ij8cz4yqhwsko0o8@www.sms.ed.ac.uk>
References: <20070222214414.ij8cz4yqhwsko0o8@www.sms.ed.ac.uk>
Message-ID: <61598.75.3.140.8.1172181955.squirrel@flash.uchicago.edu>

Dear Chris,

There is a completely new algorithm for redistribution of particles in
FLASH3, which will be included in the Beta release at the end of this
month. We have also worked out the skeleton of the mapParticlesToMesh
algorithm, but that is not yet implemented.

Anshu

> Dear Flash users,
>
> My name is Chris Daley and I am an MSc student at the University of
> Edinburgh.
>
> I am investigating ways to optimise the run time of two subroutines in
> the FLASH code for my MSc dissertation.
>
> The subroutines that we wish to optimise are MapParticlesToMesh.F90
> and ReDistributeParticles.F90. We are using version 2.5 of the FLASH
> source code.
>
> We have found useful information in section 13 of the user-guide about
> the Particles module. However, I am wondering if anyone can provide me
> with additional information about MapParticlesToMesh.F90 and
> ReDistributeParticles.F90? Is there any documentation available which
> provides a description of these two subroutines?
>
> I am also very interested in how these subroutines have changed in
> version 3.0 of the FLASH code. The FLASH 3.0 alpha release contains a
> skeleton subroutine named Grid_moveParticles.F90, which seems to
> perform the same task as ReDistributeParticles.F90 in version 2.5. Is
> there any documentation or code information describing
> Grid_moveParticles.F90?
>
> I am very grateful for any feedback.
>
> Kind Regards,
> Chris
>
>

From dachrist at uiuc.edu Mon Feb 26 18:10:09 2007
From: dachrist at uiuc.edu (Duncan Christie)
Date: Mon, 26 Feb 2007 18:10:09 -0600
Subject: [FLASH-USERS] Next Release of FLASH
Message-ID: <4d0270898def19e72ed857227c79af52@uiuc.edu>

Out of curiosity, when is the next release of FLASH (beta?) coming out?

From dubey at flash.uchicago.edu Mon Feb 26 19:42:32 2007
From: dubey at flash.uchicago.edu (Anshu Dubey)
Date: Mon, 26 Feb 2007 19:42:32 -0600 (CST)
Subject: [FLASH-USERS] Next Release of FLASH
In-Reply-To: <4d0270898def19e72ed857227c79af52@uiuc.edu>
References: <4d0270898def19e72ed857227c79af52@uiuc.edu>
Message-ID: <55077.75.3.138.85.1172540552.squirrel@flash.uchicago.edu>

This week.

> Out of curiosity, when is the next release of FLASH (beta?) coming out?
>
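
A note on the round-off point raised in the "Crashy restarts" thread: the
reply there attributes small differences with conserved_var=.true. to an
extra multiplication and division in the variable conversion. The sketch
below is not FLASH code. It is a minimal Python illustration, assuming the
conversion amounts to multiplying a quantity by the density and dividing it
back out; the function names (round_trip, count_changed, max_rel_diff) are
hypothetical. It shows that such a round trip does not always return the
identical bit pattern, and it gives one simple way to measure how far two
sets of values (for example, the same variable taken from a restarted run
and from a "from scratch" run) have drifted apart.

```python
import random
import struct


def round_trip(x, rho):
    """Multiply by rho, then divide by rho (assumed stand-in for the conversion)."""
    return (x * rho) / rho


def same_bits(a, b):
    """True if two floats have identical 64-bit representations."""
    return struct.pack(">d", a) == struct.pack(">d", b)


def count_changed(n=100_000, seed=0):
    """Count random (x, rho) pairs whose round trip is not bit-identical to x."""
    rng = random.Random(seed)
    changed = 0
    for _ in range(n):
        x = rng.uniform(0.1, 10.0)
        rho = rng.uniform(0.1, 10.0)
        if not same_bits(round_trip(x, rho), x):
            changed += 1
    return changed


def max_rel_diff(a, b):
    """Largest relative difference between two equal-length sequences of floats."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    worst = 0.0
    for u, v in zip(a, b):
        scale = max(abs(u), abs(v))
        if scale > 0.0:
            worst = max(worst, abs(u - v) / scale)
    return worst


if __name__ == "__main__":
    print("values changed by the round trip:", count_changed())
    rng = random.Random(1)
    xs = [rng.uniform(0.1, 10.0) for _ in range(1000)]
    ys = [round_trip(x, 3.7) for x in xs]
    print("largest relative difference:", max_rel_diff(xs, ys))
```

If the largest relative difference between a restarted and an uninterrupted
run is at the round-off level right after the restart, an effect of this
kind is a plausible explanation; a much larger difference would suggest
looking elsewhere.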