[AArch64][jdk8] Java I/O workload has significant overhead without any clients
We are continuing our experiments on AArch64 and noticed something strange with I/O workloads: they take an extremely long time under DynamoRIO. I compared the same run on x86 and it is fine.
ReadWrite.java is about the simplest single-threaded Java application possible (MyFile.txt is a 350 MB text file):
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.FileWriter;
import java.io.FileReader;
import java.io.IOException;

public class ReadWrite {

    public static void main(String[] args) {
        try {
            FileReader reader = new FileReader("MyFile.txt");
            FileWriter writer = new FileWriter("MyFile2.txt", true); // append mode
            BufferedReader bufferedReader = new BufferedReader(reader);
            BufferedWriter bufferedWriter = new BufferedWriter(writer);
            String line;
            // Copy the input line by line.
            while ((line = bufferedReader.readLine()) != null) {
                bufferedWriter.write(line);
                bufferedWriter.newLine();
            }
            // Close the buffered wrappers so the last buffered chunk is flushed.
            bufferedReader.close();
            bufferedWriter.close();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
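For anyone who wants to reproduce this without the original input, a file of similar size can be generated with a small helper like the sketch below. The class name GenerateTestFile, the generate method, and the line contents are hypothetical and not part of the original run; the only detail taken from the report is the file name and the ~350 MB size, so the line-length distribution of the real MyFile.txt may differ.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of the original report.
class GenerateTestFile {

    // Writes numbered ASCII lines to `target` until at least `minBytes`
    // bytes have been written, and returns the number of bytes written.
    static long generate(Path target, long minBytes) throws IOException {
        long written = 0;
        long lineNo = 0;
        try (BufferedWriter out = Files.newBufferedWriter(target, StandardCharsets.US_ASCII)) {
            while (written < minBytes) {
                // Assumed filler content; the real file's contents are unknown.
                String line = "line " + lineNo++ + " of generated filler text for the I/O test";
                out.write(line);
                out.write('\n');
                written += line.length() + 1; // ASCII: one byte per char, plus the newline
            }
        }
        return written;
    }

    public static void main(String[] args) throws IOException {
        long size = 350L * 1024 * 1024; // ~350 MB, the size mentioned in the report
        long bytes = generate(Paths.get("MyFile.txt"), size);
        System.out.println("Wrote " + bytes + " bytes to MyFile.txt");
    }
}
```

Running it once in the working directory creates MyFile.txt; ReadWrite can then be run natively and under DynamoRIO against the same input.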
The original run without DynamoRIO took ~2 seconds. On x86 with DynamoRIO it took about 3 seconds, but on AArch64 with DynamoRIO it took 14 minutes (840 seconds, i.e. about 420x overhead) :(
Opcode statistics for the Java run:
Top 15 opcode execution counts in 64-bit AArch64 mode:
90371625 : movz
109692827 : cbz
119677065 : ldrb
141852747 : ldp
149865875 : stp
150764846 : sub
186207835 : ubfm
257289620 : str
311057147 : orr
372462682 : ldrsb
373357880 : strh
379415317 : strb
736110803 : ldrh
760824735 : ldr
2145521407 : subs
2182566199 : bcond
2473296771 : add
I also tried a native C++ version on AArch64, and it looks fine:
#include <iostream>
#include <fstream>
#include <string>
using namespace std;

int main () {
    string line;
    ifstream myfile ("MyFile.txt");
    ofstream myfile2 ("MyFile2.txt");
    if (myfile.is_open() && myfile2.is_open())
    {
        while ( getline (myfile,line) )
        {
            myfile2 << line << '\n';
        }
        myfile.close();
        myfile2.close();
    }
    else cout << "Unable to open file";
    return 0;
}
Opcode statistics for the C++ run:
Top 15 opcode execution counts in 64-bit AArch64 mode:
38881409 : sub
41657277 : bl
43253273 : cbz
44942292 : cbnz
48018122 : br
54446025 : ret
63293933 : ldur
64014752 : movz
64051241 : adrp
99217398 : stp
124038866 : bcond
127033323 : subs
127577743 : ldp
148219600 : str
211563739 : add
222393285 : orr
353538698 : ldr
Do you have any ideas about what could be wrong here? Why is there such a difference between x86 and AArch64 on Java workloads like this?
Thanks, Kirill
