Amateur programmer here, so please be benevolent.
Recently I was given a challenge to help refresh and expand my Python programming skills toward app development and ML.
I have been given Android sensor .dat files, and to access them I need to use the pandas library; however, I do not recall which part of the library is used to load and clean the data so that EDA can proceed.
My colleague provided a MATLAB script to access the files, listed below. I have zero experience with MATLAB and was unable to translate or read it (beyond just running it in MATLAB).
function [d] = get_file_data(file_path)
    disp(file_path)
    file = dir(file_path);
    fid = fopen(file_path, 'r');
    subfolders = strsplit(file.folder, filesep);
    subfolder = subfolders(end);
    [start_time, start_nano_time] = get_filename_times(file.name);
    d = [];
    if file.bytes == 0
        fprintf('warning: empty file: %s\n', file_path);
        fclose(fid);
        return
    end
    % Each packet is an 8-byte int64 timestamp followed by 4-byte singles,
    % all read big-endian ('b').
    if strcmp(subfolder, "accelerometer") == 1
        packets = floor(file.bytes / (8 + 3*4));
        d = zeros(packets, 4);
        for l = 1:packets
            time = fread(fid, 1, 'int64', 'b');
            xyz = fread(fid, 3, 'single', 'b');
            d(l, 1) = start_time + (time - start_nano_time)/1e6;
            d(l, 2) = xyz(1);
            d(l, 3) = xyz(2);
            d(l, 4) = xyz(3);
        end
    elseif strcmp(subfolder, "gravity") == 1
        packets = floor(file.bytes / (8 + 3*4));
        d = zeros(packets, 4);
        for l = 1:packets
            time = fread(fid, 1, 'int64', 'b');
            xyz = fread(fid, 3, 'single', 'b');
            d(l, 1) = start_time + (time - start_nano_time)/1e6;
            d(l, 2) = xyz(1);
            d(l, 3) = xyz(2);
            d(l, 4) = xyz(3);
        end
    elseif strcmp(subfolder, "gyroscope") == 1
        packets = floor(file.bytes / (8 + 3*4));
        d = zeros(packets, 4);
        for l = 1:packets
            time = fread(fid, 1, 'int64', 'b');
            xyz = fread(fid, 3, 'single', 'b');
            d(l, 1) = start_time + (time - start_nano_time)/1e6;
            d(l, 2) = xyz(1);
            d(l, 3) = xyz(2);
            d(l, 4) = xyz(3);
        end
    elseif strcmp(subfolder, "light") == 1
        packets = floor(file.bytes / (8 + 1*4));
        d = zeros(packets, 2);
        for l = 1:packets
            time = fread(fid, 1, 'int64', 'b');
            x = fread(fid, 1, 'single', 'b');
            d(l, 1) = start_time + (time - start_nano_time)/1e6;
            d(l, 2) = x;
        end
    elseif strcmp(subfolder, "magnetic_field") == 1
        packets = floor(file.bytes / (8 + 3*4));
        d = zeros(packets, 4);
        for l = 1:packets
            time = fread(fid, 1, 'int64', 'b');
            xyz = fread(fid, 3, 'single', 'b');
            d(l, 1) = start_time + (time - start_nano_time)/1e6;
            d(l, 2) = xyz(1);
            d(l, 3) = xyz(2);
            d(l, 4) = xyz(3);
        end
    elseif strcmp(subfolder, "pressure") == 1
        packets = floor(file.bytes / (8 + 1*4));
        d = zeros(packets, 2);
        for l = 1:packets
            time = fread(fid, 1, 'int64', 'b');
            x = fread(fid, 1, 'single', 'b');
            d(l, 1) = start_time + (time - start_nano_time)/1e6;
            d(l, 2) = x;
        end
    elseif strcmp(subfolder, "proximity") == 1
        packets = floor(file.bytes / (8 + 1*4));
        d = zeros(packets, 2);
        for l = 1:packets
            time = fread(fid, 1, 'int64', 'b');
            x = fread(fid, 1, 'single', 'b');
            d(l, 1) = start_time + (time - start_nano_time)/1e6;
            d(l, 2) = x;
        end
    elseif strcmp(subfolder, "step_detector") == 1
        packets = floor(file.bytes / 8);
        time = fread(fid, packets, 'int64', 'b');
        d = start_time + (time - start_nano_time)/1e6;
    end
    fclose(fid);
    % Convert the timestamp column from milliseconds to seconds, then sort.
    d(:, 1) = d(:, 1)./1e3;
    d = sortrows(d);
end
Is there a Python analog that anyone could recommend?
Here is what I have found and attempted so far, without success:
https://www.mathworks.com/help/matlab/ref/fread.html
Code attempt below:
file = open("C:/Users/CSAS/Desktop/Files!/DS-Materials/sensor-data/c-39db1919-6fff-4229-9ee8-c7b13f882f59/sensor/accelerometer/1654736675164-3669968901397.dat", "rb")
file.close()
print(file)
data = pd.read_table('~/c-39db1919-6fff-4229-9ee8-c7b13f882f59/sensor/accelerometer/1654736675164-3669968901397.dat', encoding=('ISO-8859-1'), low_memory=False, error_bad_lines=False, lineterminator='\n')
data
This is what it prints; however, removing any argument from the pd.read_table line ends in an error.
Unnamed: 0=\聟脜脌脵聢A聴0贸=R赂露脌脧潞A聴NaN1聜=W聼>脌脵聢A聰聻NaN2聣A聴NaN3聣A聰聻NaN4聣A聽脼NaN.........118聣A聤脨NaN119赂=K^矛脌枚茂A聤脨NaN120g^=R赂露脌脵聢A聫路卯=R赂露脌脧潞A聤脨121脴每贸=W聼>脌脧潞A聢]NaN122l=U 煤脌录 A聫路&=PEs脌茫UA聧D
123 rows × 2 columns
It is supposed to look like this (with the correct syntax, of course):
ACCELEROMETER(Sensor.TYPE_ACCELEROMETER, Sensor.STRING_TYPE_ACCELEROMETER, 8 + 3*4),
AMBIENT_TEMPERATURE(Sensor.TYPE_AMBIENT_TEMPERATURE, Sensor.STRING_TYPE_AMBIENT_TEMPERATURE, 8 + 1*4),
GRAVITY(Sensor.TYPE_GRAVITY, Sensor.STRING_TYPE_GRAVITY, 8 + 3*4),
and so on for the other sensors.
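To make the last number in each entry concrete: it reads as the packet size in bytes, i.e. an 8-byte timestamp plus three (or one) 4-byte floats. Here is a quick check with Python's standard `struct` module, assuming the big-endian layout that the MATLAB script reads with the 'b' flag. The constant names are made up for illustration.

```python
import struct

# '>' = big-endian with no padding, 'q' = int64 timestamp, 'f' = 32-bit float.
# Hypothetical names; the groupings follow the branches of the MATLAB script.
XYZ_PACKET = struct.Struct(">q3f")      # accelerometer, gravity, gyroscope, magnetic_field
SCALAR_PACKET = struct.Struct(">qf")    # light, pressure, proximity
TIME_ONLY_PACKET = struct.Struct(">q")  # step_detector

# Packet sizes in bytes: 8 + 3*4, 8 + 1*4, and 8.
print(XYZ_PACKET.size, SCALAR_PACKET.size, TIME_ONLY_PACKET.size)

# Round trip with made-up values, just to show one packet being decoded.
raw = XYZ_PACKET.pack(3669968901397, 0.5, -9.75, 0.25)
nano_time, x, y, z = XYZ_PACKET.unpack(raw)
print(nano_time, x, y, z)
```

This also explains the garbage from pd.read_table: the file is packed binary, not text, so no text encoding will decode it.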
Here are a few links which have provided some insight, though for files not like this one:
https://stackoverflow.com/questions/16573089/reading-binary-data-into-pandas
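Along the lines of that link, a rough Python sketch of the MATLAB reader might look like the following. It is a sketch under assumptions, not a verified translation: `read_sensor_file`, `parse_filename_times`, and the column names are made up here, and since `get_filename_times` is not shown, the filename is assumed to be `<start_ms>-<start_nano>.dat`, as the example path suggests. It uses a NumPy structured dtype to decode every packet at once instead of looping.

```python
import os

import numpy as np
import pandas as pd

# Value columns per sensor folder, following the branches of the MATLAB script.
# The column names themselves are hypothetical.
SENSOR_COLUMNS = {
    "accelerometer": ["x", "y", "z"],
    "gravity": ["x", "y", "z"],
    "gyroscope": ["x", "y", "z"],
    "magnetic_field": ["x", "y", "z"],
    "light": ["value"],
    "pressure": ["value"],
    "proximity": ["value"],
    "step_detector": [],
}


def parse_filename_times(file_path):
    """ASSUMPTION: the file is named '<start_ms>-<start_nano>.dat'."""
    stem = os.path.splitext(os.path.basename(file_path))[0]
    start_ms, start_nano = stem.split("-")
    return int(start_ms), int(start_nano)


def read_sensor_file(file_path):
    """Read one sensor .dat file into a DataFrame with time in seconds."""
    # The sensor type is the name of the folder that contains the file.
    sensor = os.path.basename(os.path.dirname(os.path.abspath(file_path)))
    columns = SENSOR_COLUMNS[sensor]
    start_ms, start_nano = parse_filename_times(file_path)

    # One packet = big-endian int64 timestamp + big-endian float32 values.
    packet = np.dtype([("nano", ">i8")] + [(name, ">f4") for name in columns])

    with open(file_path, "rb") as fh:
        raw = fh.read()
    n_packets = len(raw) // packet.itemsize
    if n_packets == 0:
        print(f"warning: empty file: {file_path}")
        return pd.DataFrame(columns=["time_s"] + columns)

    # Ignore any trailing partial packet, as floor() does in the MATLAB code.
    records = np.frombuffer(raw[: n_packets * packet.itemsize], dtype=packet)

    # start_ms + elapsed nanoseconds converted to ms, then ms -> seconds.
    time_ms = start_ms + (records["nano"] - start_nano) / 1e6
    df = pd.DataFrame({"time_s": time_ms / 1e3})
    for name in columns:
        df[name] = records[name].astype("float32")  # native byte order

    # MATLAB's sortrows sorts on all columns; this sorts on time only.
    return df.sort_values("time_s", kind="stable").reset_index(drop=True)
```

Calling `read_sensor_file` on the accelerometer path from the attempt above should then give a DataFrame with `time_s`, `x`, `y`, `z` columns, ready for the usual pandas cleaning steps.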