Sort a Julia 1.1 matrix by one of its columns that contains strings
You want sortslices, not sort: the latter sorts each column independently, whereas the former rearranges whole slices. Also, the by function doesn't take an index; it takes the value that is about to be compared (here, an entire row) and lets you transform it in some way. Thus:
julia> using Random
julia> data = Union{Float64, String}[randn(100) [randstring(10) for _ in 1:100]]
100×2 Array{Union{Float64, String},2}:
0.211015 "6VPQbWU5f9"
-0.292298 "HgvHLkufqI"
1.74231 "zTCu1U5Vdl"
0.195822 "O3j43sbhKV"
⋮
-0.369007 "VzFH2OpWfU"
-1.30459 "6C68G64AWg"
-1.02434 "rldaQ3e0GE"
1.61653 "vjvn1SX3FW"
julia> sortslices(data, by=x->x[2], dims=1)
100×2 Array{Union{Float64, String},2}:
0.229143 "0syMQ7AFgQ"
-0.642065 "0wUew61bI5"
1.16888 "12PUn4V4gL"
-0.266574 "1Z2ONSBP04"
⋮
1.85761 "y2DDANcFCe"
1.53337 "yZju1uQqMM"
1.74231 "zTCu1U5Vdl"
0.974607 "zdiU0sVOZt"
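To make the role of by easier to see, here is a minimal sketch on a tiny hand-written matrix (the values and the names small, sorted and descending are made up for illustration). With dims=1, by is called with each whole row, so row[2] picks out that row's string as the sort key.

```julia
# Illustrative 3×2 matrix built the same way as `data` above; values are invented.
small = Union{Float64, String}[[1.5, -0.3, 2.0] ["pear", "apple", "fig"]]

# `by` receives an entire row slice; row[2] is the string used for comparison.
sorted = sortslices(small, dims=1, by=row -> row[2])

# rev=true gives descending order on the same key.
descending = sortslices(small, dims=1, by=row -> row[2], rev=true)
```

Each number stays attached to its string because whole rows are moved, which is the difference from sorting the columns independently.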
Unfortunately we don't have an in-place sortslices! yet, but you can easily construct a sorted view with sortperm. This probably won't be as fast to use, but if you need the in-place-ness for semantic reasons it'll do just the trick.
julia> p = sortperm(data[:,2]);
julia> @view data[p, :]
100×2 view(::Array{Union{Float64, String},2}, [26, 45, 90, 87, 6, 96, 82, 75, 12, 27 … 53, 69, 100, 93, 36, 37, 39, 8, 3, 61], :) with eltype Union{Float64, String}:
0.229143 "0syMQ7AFgQ"
-0.642065 "0wUew61bI5"
1.16888 "12PUn4V4gL"
-0.266574 "1Z2ONSBP04"
⋮
1.85761 "y2DDANcFCe"
1.53337 "yZju1uQqMM"
1.74231 "zTCu1U5Vdl"
0.974607 "zdiU0sVOZt"
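If the original array itself has to end up sorted, one option is to write the permuted rows back into it. This is only a sketch on a made-up three-row matrix (small and alias are illustrative names). Note that small[p, :] allocates a temporary copy, so this is in-place only in the sense that the same array object is mutated, not in the sense of saving memory.

```julia
# Illustrative matrix; values are invented.
small = Union{Float64, String}[[1.5, -0.3, 2.0] ["pear", "apple", "fig"]]
alias = small                  # a second reference to the same array

p = sortperm(small[:, 2])      # permutation that sorts the string column
small .= small[p, :]           # the indexed copy is built first, then written back

# `alias` now sees the sorted rows too, since no new array was bound to `small`.
```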
(If you want the in-place-ness for performance reasons, I'd recommend using a DataFrame or a similar structure that holds its columns as independent homogeneous vectors: a Union{Float64, String} matrix will be slower than two separate well-typed vectors, and calling sort! on a DataFrame works on whole rows like you want.)
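The column-wise idea can also be sketched without any package, assuming you are happy to manage the columns yourself: keep each column as its own concretely typed vector and apply one permutation to both. The names vals and labels and their contents are illustrative; this is roughly the bookkeeping a DataFrame handles for you.

```julia
vals   = [1.5, -0.3, 2.0]            # Vector{Float64}
labels = ["pear", "apple", "fig"]    # Vector{String}

p = sortperm(labels)                 # order that sorts the string column

# Reorder both vectors in place with the same permutation so rows stay aligned.
permute!(vals, p)
permute!(labels, p)
```

Because each vector has a concrete element type, the compiler can specialize on it instead of checking a Union for every element.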