*William Stallings Computer Organization

*Computer Architecture And Organization Pdf

Full file at http://testbank360.eu/solution-manual-computer-organization-and-architecture-9thedition-william-stallings

CHAPTER 1 OVERVIEW

ANSWERS TO QUESTIONS

1.1 Computer architecture refers to those attributes of a system visible to a programmer or, put another way, those attributes that have a direct impact on the logical execution of a program. Computer organization refers to the operational units and their interconnections that realize the architectural specifications. Examples of architectural attributes include the instruction set, the number of bits used to represent various data types (e.g., numbers, characters), I/O mechanisms, and techniques for addressing memory. Organizational attributes include those hardware details transparent to the programmer, such as control signals; interfaces between the computer and peripherals; and the memory technology used.

1.2 Computer structure refers to the way in which the components of a computer are interrelated. Computer function refers to the operation of each individual component as part of the structure.

1.3 Data processing; data storage; data movement; and control.

1.4 Central processing unit (CPU): Controls the operation of the computer and performs its data processing functions; often simply referred to as processor.
Main memory: Stores data.
I/O: Moves data between the computer and its external environment.
System interconnection: Some mechanism that provides for communication among CPU, main memory, and I/O. A common example of system interconnection is by means of a system bus, consisting of a number of conducting wires to which all the other components attach.
1.5 Control unit: Controls the operation of the CPU and hence the computer.
Arithmetic and logic unit (ALU): Performs the computer’s data processing functions.
Registers: Provide storage internal to the CPU.
CPU interconnection: Some mechanism that provides for communication among the control unit, ALU, and registers.

CHAPTER 2 COMPUTER EVOLUTION AND PERFORMANCE

ANSWERS TO QUESTIONS

2.1 In a stored program computer, programs are represented in a form suitable for storing in memory alongside the data. The computer gets its instructions by reading them from memory, and a program can be set or altered by setting the values of a portion of memory.

2.2 A main memory, which stores both data and instructions; an arithmetic and logic unit (ALU) capable of operating on binary data; a control unit, which interprets the instructions in memory and causes them to be executed; and input and output (I/O) equipment operated by the control unit.

2.3 Gates, memory cells, and interconnections among gates and memory cells.

2.4 Moore observed that the number of transistors that could be put on a single chip was doubling every year and correctly predicted that this pace would continue into the near future.

2.5 Similar or identical instruction set: In many cases, the same set of machine instructions is supported on all members of the family. Thus, a program that executes on one machine will also execute on any other.
Similar or identical operating system: The same basic operating system is available for all family members.
Increasing speed: The rate of instruction execution increases in going from lower to higher family members.
Increasing number of I/O ports: In going from lower to higher family members.
Increasing memory size: In going from lower to higher family members.
Increasing cost: In going from lower to higher family members.

2.6 In a microprocessor, all of the components of the CPU are on a single chip.

ANSWERS TO PROBLEMS

2.1 a.

Location  Instruction/Value  Comments
0         <>                 Constant (N) [initialized to some value]
1         1                  Constant; Integer value = 1
2         2                  Constant; Integer value = 2
3         0                  Variable Y (initialized to integer zero); Sum(Y)
4L        LOAD M(0)          N → AC
4R        ADD M(1)           AC + 1 → AC
5L        MUL M(0)           N(N+1) → AC
5R        DIV M(2)           AC/2 → AC
6L        STOR M(3)          AC → Y; saving the Sum in variable Y
6R        JUMP M(6,20:39)    Done; HALT

b.

Location  Instruction/Value   Comments
0         <>                  Constant (N) [initialized to some value]
1         1                   Constant (loop counter increment)
2         1                   Variable i (loop index value; current)
3         1                   Variable Y = Sum of X values (Initialized to One)
4L        LOAD M(0)           N → AC (the max limit)
4R        SUB M(2)            Compute N–i → AC
5L        JUMP + M(6,0:19)    Check AC > 0 ? [i < N]
5R        JUMP + M(5,20:39)   i=N; done so HALT
6L        LOAD M(2)           i → AC
6R        ADD M(1)            i+1 in AC
7L        STOR M(2)           AC → i
7R        ADD M(3)            i + Y in AC
8L        STOR M(3)           AC → Y
8R        JUMP M(4,0:19)      Continue at instruction located at address 4L

2.2 a. Opcode: 00000001. Operand: 000000000010.

b. First, the CPU must access memory to fetch the instruction. The instruction contains the address of the data we want to load. During the execute phase, the CPU accesses memory again to load the data value located at that address, for a total of two trips to memory.

2.3 To read a value from memory, the CPU puts the address of the value it wants into the MAR. The CPU then asserts the Read control line to memory and places the address on the address bus. Memory places the contents of the addressed memory location on the data bus.
This data is then transferred to the MBR.

To write a value to memory, the CPU puts the address of the value it wants to write into the MAR. The CPU also places the data it wants to write into the MBR. The CPU then asserts the Write control line to memory and places the address on the address bus and the data on the data bus. Memory transfers the data on the data bus into the corresponding memory location.

2.4

Address  Contents
08A      LOAD M(0FA)     STOR M(0FB)
08B      LOAD M(0FA)     JUMP +M(08D)
08C      LOAD –M(0FA)    STOR M(0FB)
08D

This program will store the absolute value of the content at memory location 0FA into memory location 0FB.

2.5 All data paths to/from MBR are 40 bits. All data paths to/from MAR are 12 bits. Paths to/from AC are 40 bits. Paths to/from MQ are 40 bits.

2.6 The purpose is to increase performance. When an address is presented to a memory module, there is some time delay before the read or write operation can be performed. While this is happening, an address can be presented to the other module. For a series of requests for successive words, the maximum rate is doubled.

2.7 The discrepancy can be explained by noting that factors other than clock speed make a big difference in overall system speed. In particular, memory systems and advances in I/O processing contribute to the performance ratio. A system is only as fast as its slowest link. In recent years, the bottlenecks have been the performance of memory modules and bus speed.

2.8 As noted in the answer to Problem 2.7, even though the Intel machine may have a faster clock speed (2.4 GHz vs. 1.2 GHz), that does not necessarily mean the system will perform faster. Different systems are not comparable on clock speed. Other factors such as the system components (memory, buses, architecture) and the instruction sets must also be taken into account.
A more accurate measure is to run both systems on a benchmark. Benchmark programs exist for certain tasks, such as running office applications, performing floating-point operations, graphics operations, and so on. The systems can be compared to each other on how long they take to complete these tasks. According to Apple Computer, the G4 is comparable to or better than a higher-clock-speed Pentium on many benchmarks.

2.9 This representation is wasteful because to represent a single decimal digit from 0 through 9 we need to have ten tubes. If we could have an arbitrary number of these tubes ON at the same time, then those same tubes could be treated as binary bits. With ten bits, we can represent $2^{10}$ patterns, or 1024 patterns. For integers, these patterns could be used to represent the numbers from 0 through 1023.

2.10 CPI = 1.55; MIPS rate = 25.8; Execution time = 3.87 ms. Source: [HWAN93]

2.11 a.

$$CPI_A = \frac{\sum CPI_i \times I_i}{I_c} = \frac{(8 \times 1 + 4 \times 3 + 2 \times 4 + 4 \times 3) \times 10^6}{(8 + 4 + 2 + 4) \times 10^6} \approx 2.22$$

$$MIPS_A = \frac{f}{CPI_A \times 10^6} = \frac{200 \times 10^6}{2.22 \times 10^6} = 90$$

$$CPU_A = \frac{I_c \times CPI_A}{f} = \frac{18 \times 10^6 \times 2.22}{200 \times 10^6} = 0.2\ \text{s}$$

$$CPI_B = \frac{\sum CPI_i \times I_i}{I_c} = \frac{(10 \times 1 + 8 \times 2 + 2 \times 4 + 4 \times 3) \times 10^6}{(10 + 8 + 2 + 4) \times 10^6} \approx 1.92$$

$$MIPS_B = \frac{f}{CPI_B \times 10^6} = \frac{200 \times 10^6}{1.92 \times 10^6} = 104$$

$$CPU_B = \frac{I_c \times CPI_B}{f} = \frac{24 \times 10^6 \times 1.92}{200 \times 10^6} = 0.23\ \text{s}$$

b. Although machine B has a higher MIPS than machine A, it requires a longer CPU time to execute the same set of benchmark programs.

2.12 a. We can express the MIPS rate as: $(\text{MIPS rate}) \times 10^6 = I_c/T$. So that: $I_c = T \times (\text{MIPS rate}) \times 10^6$. The ratio of the instruction count of the RS/6000 to the VAX is $[x \times 18]/[12x \times 1] = 1.5$.

b. For the VAX, CPI = (5 MHz)/(1 MIPS) = 5. For the RS/6000, CPI = 25/18 = 1.39.

2.13 From Equation (2.2), $MIPS = I_c/(T \times 10^6) = 100/T$.
The MIPS values are:

            Computer A  Computer B  Computer C
Program 1   100         10          5
Program 2   0.1         1           5
Program 3   0.2         0.1         2
Program 4   1           0.125       1

            Arithmetic mean  Rank  Harmonic mean  Rank
Computer A  25.325           1     0.25           2
Computer B  2.8              3     0.21           3
Computer C  3.25             2     2.1            1

2.14 a. Normalized to R:

                 Processor
Benchmark        R     M     Z
E                1.00  1.71  3.11
F                1.00  1.19  1.19
H                1.00  0.43  0.49
I                1.00  1.11  0.60
K                1.00  2.10  2.09
Arithmetic mean  1.00  1.31  1.50

b. Normalized to M:

                 Processor
Benchmark        R     M     Z
E                0.59  1.00  1.82
F                0.84  1.00  1.00
H                2.32  1.00  1.13
I                0.90  1.00  0.54
K                0.48  1.00  1.00
Arithmetic mean  1.01  1.00  1.10

c. Recall that the larger the ratio, the higher the speed. Based on (a), R is the slowest machine, by a significant amount. Based on (b), M is the slowest machine, by a modest amount.

d. Normalized to R:

                 Processor
Benchmark        R     M     Z
E                1.00  1.71  3.11
F                1.00  1.19  1.19
H                1.00  0.43  0.49
I                1.00  1.11  0.60
K                1.00  2.10  2.09
Geometric mean   1.00  1.15  1.18

Normalized to M:

                 Processor
Benchmark        R     M     Z
E                0.59  1.00  1.82
F                0.84  1.00  1.00
H                2.32  1.00  1.13
I                0.90  1.00  0.54
K                0.48  1.00  1.00
Geometric mean   0.87  1.00  1.02

Using the geometric mean, R is the slowest no matter which machine is used for normalization.

2.15 a. Normalized to X:

                 Processor
Benchmark        X     Y     Z
1                1     2.0   0.5
2                1     0.5   2.0
Arithmetic mean  1     1.25  1.25
Geometric mean   1     1     1

Normalized to Y:

                 Processor
Benchmark        X     Y     Z
1                0.5   1     0.25
2                2.0   1     4.0
Arithmetic mean  1.25  1     2.125
Geometric mean   1     1     1

Machine Y is twice as fast as machine X for benchmark 1, but half as fast for benchmark 2. Similarly, machine Z is half as fast as X for benchmark 1, but twice as fast for benchmark 2. Intuitively, these three machines have equivalent performance. However, if we normalize to X and compute the arithmetic mean of the speed metric, we find that Y and Z are 25% faster than X.
Now, if we normalize to Y and compute the arithmetic mean of the speed metric, we find that X is 25% faster than Y and Z is more than twice as fast as Y. Clearly, the arithmetic mean is worthless in this context.

b. When the geometric mean is used, the three machines are shown to have equal performance when normalized to X, and also equal performance when normalized to Y. These results are much more in line with our intuition.

2.16 a. Assuming the same instruction mix means that the additional instructions for each task should be allocated proportionally among the instruction types. So we have the following table:

Instruction Type                  CPI  Instruction Mix
Arithmetic and logic              1    60%
Load/store with cache hit         2    18%
Branch                            4    12%
Memory reference with cache miss  12   10%

$CPI = 0.6 + (2 \times 0.18) + (4 \times 0.12) + (12 \times 0.1) = 2.64$. The CPI has increased due to the increased time for memory access.

b. MIPS = 400/2.64 = 152. There is a corresponding drop in the MIPS rate.

c. The speedup factor is the ratio of the execution times. Using Equation 2.2, we calculate the execution time as $T = I_c/(MIPS \times 10^6)$. For the single-processor case, $T_1 = (2 \times 10^6)/(178 \times 10^6) = 11$ ms. With 8 processors, each processor executes 1/8 of the 2 million instructions plus the 25,000 overhead instructions. For this case, the execution time for each of the 8 processors is

$$T_8 = \frac{\dfrac{2 \times 10^6}{8} + 0.025 \times 10^6}{152 \times 10^6} = 1.8\ \text{ms}$$

Therefore we have

$$\text{Speedup} = \frac{\text{time to execute program on a single processor}}{\text{time to execute program on } N \text{ parallel processors}} = \frac{11}{1.8} = 6.11$$

d. The answer to this question depends on how we interpret Amdahl's law. There are two inefficiencies in the parallel system. First, there are additional instructions added to coordinate between threads. Second, there is contention for memory access. The way that the problem is stated implies that none of the code is inherently serial.
All of it is parallelizable, but with scheduling overhead. One could argue that the memory access conflict means that to some extent memory reference instructions are not parallelizable. But based on the information given, it is not clear how to quantify this effect in Amdahl's equation. If we assume that the fraction of code that is parallelizable is f = 1, then Amdahl's law reduces to Speedup = N = 8 for this case. Thus the actual speedup is only about 75% of the theoretical speedup.

2.17 a. Speedup = (time to access in main memory)/(time to access in cache) = $T_2/T_1$.

b. The average access time can be computed as $T = H \times T_1 + (1 - H) \times T_2$. Using Equation (2.8):

$$\text{Speedup} = \frac{\text{Execution time before enhancement}}{\text{Execution time after enhancement}} = \frac{T_2}{T} = \frac{T_2}{H \times T_1 + (1 - H) T_2} = \frac{1}{(1 - H) + H \dfrac{T_1}{T_2}}$$

c. $T = H \times T_1 + (1 - H) \times (T_1 + T_2) = T_1 + (1 - H) \times T_2$. This is Equation (4.2) in Chapter 4. Now,

$$\text{Speedup} = \frac{\text{Execution time before enhancement}}{\text{Execution time after enhancement}} = \frac{T_2}{T} = \frac{T_2}{T_1 + (1 - H) T_2} = \frac{1}{(1 - H) + \dfrac{T_1}{T_2}}$$

In this case, the denominator is larger, so that the speedup is less.

2.18 $T_w = w/\lambda = 8/18 = 0.44$ hours
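
The arithmetic in the answer to Problem 2.16 can be re-checked with a short Python sketch. The function and variable names are illustrative and do not come from the manual; the 400 MHz clock, the 178 MIPS single-processor rate, and the 25,000 overhead instructions are the figures used in the answer above. Because the sketch does not round intermediate values, its speedup comes out slightly higher than the manual's 11/1.8 = 6.11.

```python
# Instruction mix from Problem 2.16(a): (CPI, fraction of instructions).
MIX = [
    (1, 0.60),   # arithmetic and logic
    (2, 0.18),   # load/store with cache hit
    (4, 0.12),   # branch
    (12, 0.10),  # memory reference with cache miss
]


def average_cpi(mix):
    """Weighted average cycles per instruction over the mix."""
    return sum(cpi * fraction for cpi, fraction in mix)


def mips_rate(clock_mhz, cpi):
    """MIPS = f / (CPI * 10^6); with f in MHz the 10^6 cancels."""
    return clock_mhz / cpi


def exec_time(instruction_count, mips):
    """T = Ic / (MIPS * 10^6), in seconds."""
    return instruction_count / (mips * 1e6)


cpi = average_cpi(MIX)                   # part (a)
mips = mips_rate(400, cpi)               # part (b)
t1 = exec_time(2e6, 178)                 # single processor, 178 MIPS
t8 = exec_time(2e6 / 8 + 25_000, mips)   # one of 8 processors, with overhead
speedup = t1 / t8                        # part (c)

print(f"CPI = {cpi:.2f}, MIPS = {mips:.0f}")
print(f"T1 = {t1 * 1e3:.1f} ms, T8 = {t8 * 1e3:.1f} ms, speedup = {speedup:.2f}")
```

Dividing `speedup` by 8 gives the fraction of the ideal speedup discussed in part (d).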

*Step 1 of 11

Given data:

Here, the aim is to write an IAS program to compute the summation of the first N natural numbers and save the result in a variable named Y.

• This is expressed by the following equation:

$$Y = \sum_{X=1}^{N} X$$

Where,

o “Σ” represents the summation.

o The value of X ranges from 1 to N.

o This is represented mathematically as $1 \le X \le N$.

• Computation does not result in arithmetic overflow.

• “X”, “Y”, and “N” values are positive integers where the “N” value is greater than or equal to 1.

*Step 2 of 11

a.

The IAS program for the equation is as shown below:

Location  Instruction/Value  Comments
0         < >                Constant (N) [initialized to some value]
1         1                  Constant; Integer value = 1
2         2                  Constant; Integer value = 2
3         0                  Variable Y (initialized to integer zero); Sum(Y)
4L        LOAD M(0)          N → AC
4R        ADD M(1)           AC + 1 → AC
5L        MUL M(0)           N(N+1) → AC
5R        DIV M(2)           AC/2 → AC
6L        STOR M(3)          AC → Y; saving the Sum in variable Y
6R        JUMP M(6,20:39)    Done; HALT
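
The program in part (a) can be traced with a minimal Python sketch, one statement per IAS instruction. This is an illustration under simplifying assumptions, not an IAS emulator: the function name `ias_sum_closed_form` is hypothetical, AC is treated as a single unbounded integer as in the comments column, and the sketch does not model the MQ register that the IAS multiply and divide instructions also use.

```python
def ias_sum_closed_form(n):
    """Trace program (a): compute Y = N(N+1)/2 through the accumulator."""
    ac = n          # 4L: LOAD M(0)        N -> AC
    ac = ac + 1     # 4R: ADD M(1)         AC + 1 -> AC
    ac = ac * n     # 5L: MUL M(0)         N(N+1) -> AC
    ac = ac // 2    # 5R: DIV M(2)         AC/2 -> AC (always exact here)
    y = ac          # 6L: STOR M(3)        AC -> Y
    return y        # 6R: JUMP M(6,20:39)  jump-to-self, i.e. HALT


for n in (1, 2, 10, 100):
    print(n, ias_sum_closed_form(n))
```

The integer division is exact because one of N and N+1 is always even.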

*Step 3 of 11

Explanation:

• An IAS word is 40 bits long and holds two instructions: a left instruction of 20 bits and a right instruction of 20 bits.

• In the above-mentioned code, a constant “N” with some initial value is placed in location “0”.

• Locations “1” and “2” hold the constants with integer values “1” and “2”.

• Variable “Y” is initialized to integer value 0.

• The constant “N” is loaded into the accumulator AC, that is, N → AC. This instruction is stored in the left portion of the 4th location.

• Increment the accumulator by “1” and keep the result in the accumulator, i.e. AC + 1 → AC. This instruction is stored in the right portion of the 4th location.

This process is as shown:

Left Instruction    Right Instruction
LOAD M(0)           ADD M(1)

*Step 4 of 11

• Multiply the accumulator by the constant N and keep the result in the accumulator, that is, N(N+1) → AC. This instruction is stored in the left portion of the 5th location.

• Divide the accumulator by “2” and keep the result in the accumulator, that is, AC/2 → AC. This instruction is stored in the right portion of the 5th location.

• This process is as shown:

Left Instruction    Right Instruction
MUL M(0)            DIV M(2)

*Step 5 of 11

• Finally, save the accumulator in the variable “Y”.

Left Instruction    Right Instruction
STOR M(3)           JUMP M(6,20:39)
AC → Y              HALT

*Step 6 of 11

b.

Location  Instruction / Value  Comments
0         < >                  Constant (N) [initialized to some value]
1         1                    Constant (loop counter increment)
2         1                    Variable i (loop index value; current)
3         1                    Variable Y = Sum of X values (Initialized to One)
4L        LOAD M(0)            N → AC (the max limit)
4R        SUB M(2)             Compute N–i → AC
5L        JUMP + M(6,0:19)     Check AC > 0 ? [i < N]
5R        JUMP + M(5,20:39)    i=N; done so HALT
6L        LOAD M(2)            i → AC
6R        ADD M(1)             i+1 in AC
7L        STOR M(2)            AC → i
7R        ADD M(3)             i + Y in AC
8L        STOR M(3)            AC → Y
8R        JUMP M(4,0:19)       Continue at instruction located at address 4L
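
The loop in part (b) can be traced in the same way. The sketch below is a hedged illustration with a hypothetical function name, `ias_sum_loop`; memory locations 1 to 3 become local variables. It follows the comments column and takes the branch at 5L only when AC > 0. That test is an assumption of the sketch: the IAS conditional jump is usually described as testing for a nonnegative accumulator, and under that reading the loop would not stop at i = N.

```python
def ias_sum_loop(n):
    """Trace program (b): accumulate Y = 1 + 2 + ... + N in a loop."""
    one = 1  # location 1: loop counter increment
    i = 1    # location 2: loop index
    y = 1    # location 3: running sum, initialized to one
    while True:
        ac = n - i       # 4L, 4R: LOAD M(0); SUB M(2)    N - i -> AC
        if not ac > 0:   # 5L: branch to 6L only while i < N (assumed test)
            return y     # 5R: JUMP + M(5,20:39), jump-to-self, i.e. HALT
        ac = i + one     # 6L, 6R: LOAD M(2); ADD M(1)    i + 1 in AC
        i = ac           # 7L: STOR M(2)                  AC -> i
        ac = ac + y      # 7R: ADD M(3)                   i + Y in AC
        y = ac           # 8L: STOR M(3)                  AC -> Y
        # 8R: JUMP M(4,0:19) continues with the next loop iteration


for n in (1, 3, 10):
    print(n, ias_sum_loop(n))
```

Initializing Y to one rather than zero accounts for the first term, so the loop body only runs for i = 2 through N.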

*Step 7 of 11

Explanation:

• Take a constant “N” that indicates the maximum number.

• Take the constant “1” as the loop counter increment.

• Take a variable “i” that is the loop index.

• Variable “Y” holds the sum of all “X” values and is initially set to “1”.

• Load the “N” value into the accumulator; this instruction is in the left portion of the 4th location.

• Subtract the “i” value from “N” and keep the result in the accumulator, as shown:

Left Instruction    Right Instruction
LOAD M(0)           SUB M(2)

*Step 8 of 11

• If the AC value is greater than “0”, that is, the “i” value is less than “N”, jump to the left portion of the 6th location. If the “N” value is equal to “i”, execution continues with the right portion of the 5th location, which jumps to itself and so acts as a HALT.

Note: In an instruction, 0:19 indicates the left 20 bits of a word and 20:39 indicates the right 20 bits.

Left Instruction      Right Instruction
JUMP + M(6,0:19)      JUMP + M(5,20:39)
AC > 0 ? [i < N]      i=N
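
The 0:19 and 20:39 notation in the note above can be illustrated with a small Python sketch that packs two 20-bit instructions into one 40-bit word and splits them apart again. All function names are hypothetical. The sketch assumes that bit 0 is the most significant bit of the word and that each 20-bit instruction is an 8-bit opcode followed by a 12-bit address, which matches the 8-bit opcode and 12-bit operand shown in the answer to Problem 2.2(a). The only opcode used is 00000001 (LOAD), also taken from that answer.

```python
LOAD_OPCODE = 0b00000001  # from the answer to Problem 2.2(a)


def make_instruction(opcode, address):
    """Build a 20-bit instruction: 8-bit opcode, then 12-bit address."""
    return ((opcode & 0xFF) << 12) | (address & 0xFFF)


def pack_word(left, right):
    """Pack two 20-bit instructions into one 40-bit word."""
    return ((left & 0xFFFFF) << 20) | (right & 0xFFFFF)


def split_word(word):
    """Return (left, right): bits 0:19 and bits 20:39 of the word."""
    return (word >> 20) & 0xFFFFF, word & 0xFFFFF


def decode(instruction):
    """Return (opcode, address) of a 20-bit instruction."""
    return instruction >> 12, instruction & 0xFFF


# Left half: LOAD M(2); right half: LOAD M(0).
word = pack_word(make_instruction(LOAD_OPCODE, 2),
                 make_instruction(LOAD_OPCODE, 0))
left, right = split_word(word)
print(f"{word:040b}")
print(decode(left), decode(right))
```

A jump target such as M(6,0:19) then names the left half of the word at location 6, and M(5,20:39) names the right half of the word at location 5.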

*Step 9 of 11

• If the “i” value is less than “N”, load “i” into the accumulator (left portion of the 6th location), then increment it by “1” and keep the result in the accumulator (right portion of the 6th location), as shown below:

Left Instruction    Right Instruction
LOAD M(2)           ADD M(1)

*Step 10 of 11

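• Store the accumulator value in “i”; this instruction is in the left portion of the 7th location.

• Add “Y” to the accumulator, which still holds “i”, so that the accumulator holds i + Y; this instruction is in the right portion of the 7th location.

Left Instruction    Right Instruction
STOR M(2)           ADD M(3)
AC → i              i + Y in AC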

*Step 11 of 11

• Store the value of the accumulator in variable “Y”, then jump to the left portion of the 4th location. Continue in the same manner until the end of the loop.

Left Instruction    Right Instruction
STOR M(3)           JUMP M(4,0:19)
AC → Y              Loop to 4L

 

 

 

 
