Thanks Steve! I tried using the run.py in configs/splash2 directory but I
keep getting segmentation fault. I am running FFT benchmark (I downloaded
the binaries from the link provided at M5 wiki's splash benchmark's page). I
made following changes in my python script (run.py):
................................
class FFT(LiveProcess):
cwd = options.rootdir + '/kernels/fft'
executable = options.rootdir + '/kernels/fft/FFT'
cmd = ['FFT', '-p', str(*options.numcpus*options.smt*), '-m18']
..............................
..............................
cpus = [DerivO3CPU(cpu_id=i,
clock=options.frequency,
*numThreads=options.smt,
smtNumFetchingThreads=options.smt,
smtFetchPolicy='RoundRobin'*)
for i in xrange(options.numcpus)]
So, if I understood what Steve said in last mail then I think I am running
it correctly. I tried debugging but it didn't help. I am using a latest
version of M5 which I downloaded from the repositories about a week ago.
Also when I used smtNumFetchingThreads=1 and numThreads=1 then it works!!
This is what I got after running gdb:
: /af20/ar8eb/Desktop/CS8501/m5-ad784e759a74 ; gdb build/ALPHA_SE/m5.debug
GNU gdb 6.8-debian
................................................................
This GDB was configured as "x86_64-linux-gnu"...
(gdb) run configs/splash2/run_test.py -n 1 --smt=2
--rootdir=../m5-stable-94c016415053/v1-splash-alpha/splash2/codes/ -d -b FFT
....................................................................
command line:
/net/af20/ar8eb/Desktop/CS8501/m5-ad784e759a74/build/ALPHA_SE/m5.debug
configs/splash2/run_test.py -n 1 --smt=2
--rootdir=../m5-stable-94c016415053/v1-splash-alpha/splash2/codes/ -d -b FFT
Global frequency set at 1000000000000 ticks per second
0: system.remote_gdb.listener: listening for remote gdb #0 on port 7000
0: system.remote_gdb.listener: listening for remote gdb #1 on port 7001
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7f6d50dc76f0 (LWP 18930)]
0x00000000004522a0 in std::vector<int, std::allocator<int> >::push_back
(this=0x40, __x=@0x7fff58dea6b4) at
/usr/include/c++/4.2/bits/stl_vector.h:599
599 if (this->_M_impl._M_finish != this->_M_impl._M_end_of_storage)
Thanks,
Abhishek
Post by Steve ReinhardtIf the benchmark is explicitly multithreaded and compiled against the
m5threads library, then you can run it on an SMT O3 CPU and it should
work.
Steve
Post by abhishek rawatThanks for the reply. I have a follow up question. So, lets say I am only
using one O3 core and If I want to run 2-threaded SMT that means I will
have
Post by abhishek rawatto provide as input for instance 2 FFT benchmarks. And for 4-threaded
SMT
Post by abhishek rawatsimilarly I will have to provide as input 4 FFT benchmark. In this case
how
Post by abhishek rawatcan I do the comparison as the benchmarks have changed, in size. So my
question was can I just provide lets say 1 FFT benchmark as input and
then
Post by abhishek rawattry to run it with 2 threaded-SMT and then with 4 threaded-SMT
configuration. And I think you are saying it is not possible.
Thanks,
--Abhishek
Post by Korey SewellPost by abhishek rawatI have this doubt related to O3 SMT model in ALPHA_SE mode. So I
understand that for running multi-programmed benchmarks one can pass a
vector of benchmarks as the parameters and set the cpu.numThreads to
the
Post by abhishek rawatPost by Korey SewellPost by abhishek rawatsize of vector. But is it possible for the O3 CPU to extract the TLP in
a
Post by abhishek rawatPost by Korey SewellPost by abhishek rawat(single) given multithreaded benchmark. So what I am trying to say is
lets
Post by abhishek rawatPost by Korey SewellPost by abhishek rawatsay I just want to run a single multi-threaded benchmark and just by
specifying the number of SMT threads (say 4 SMT threads) can the O3 CPU
extract the threads from the benchmark and run in a SMT fashion? I
would
Post by abhishek rawatPost by Korey SewellPost by abhishek rawatappreciate any directions in this regard.
As far as I know, O3CPU will only run the threads that you give it, so
your actual benchmark needs to specify how many threads are going to be
run
Post by abhishek rawatPost by Korey Sewellin parallel in order for you to see any benefit in SE_SMT. The problem
of
Post by abhishek rawatPost by Korey Sewellautomatically extracting TLP is in fact it's own subset of the research
community, so if you wish, I'm sure you can find plenty of literature on
that topic.
Post by abhishek rawatSomething tells me that it is not possible. And if that is the case
then
Post by abhishek rawatPost by Korey SewellPost by abhishek rawathow do I compare different SMT configurations for a given benchmark ?
That's a bit of a abstract question and on the M5-side, I'm not sure
there
Post by abhishek rawatPost by Korey Sewellis a definite *right* answer here just user-dependent ones. For some
people,
Post by abhishek rawatPost by Korey Sewellcomparing a 2-threaded SMT core v. a 4-threaded SMT core is fine. For
others, comparing a 2T-SMT core vs. 2 CMP cores is their objective.
Considering a configuration to be the benchmark running and the hardware
that it is to be run on, It's up to the user to specify different M5
configurations that are interesting to them, run M5 for different
configurations, figure out what configurations are comparable to each
other,
Post by abhishek rawatPost by Korey Sewelland then decide what to do next with their results.
--
- Korey
_______________________________________________
m5-users mailing list
http://m5sim.org/cgi-bin/mailman/listinfo/m5-users
--
-Abhi
_______________________________________________
m5-users mailing list
http://m5sim.org/cgi-bin/mailman/listinfo/m5-users
_______________________________________________
m5-users mailing list
http://m5sim.org/cgi-bin/mailman/listinfo/m5-users
--
-Abhi