Search code examples
javaperformancehibernatespring-data-jpa

JPA + spring boot - findBy <field> extremely slow when Database is huge ( 20M lines Table )


I need your association about JPA spring data. I have data base using MySQL with 20GB of data ( 20 m lines ) . When I execute findByID ( String id ) - ( Not the unique identifier ). it takes more than 10 minutes...

what can be the performance issues ??

My object Entity :

@Id
@GeneratedValue(strategy = GenerationType.IDENTITY)
private long rootId;

private String address;


private String check_in_time;

private String check_out_time;

@OneToMany(mappedBy = "data" , fetch = FetchType.LAZY)
private Set<DescriptionStruct> description_struct;

private String id;

@ElementCollection(fetch = FetchType.LAZY)
private Set<String> images;

private String kind;

private double latitude;

private double longitude;

private String name;

private String phone;

@Embedded
private Star_certificate star_certificate;

private String postal_code;


@AttributeOverrides({ @AttributeOverride(name = "id", column = @Column(name = "R_ID")),
        @AttributeOverride(name = "name", column = @Column(name = "R_NAME")) })
@Embedded
private Region region;

private int star_rating;

private String email;

private int semantic_version;

@ElementCollection(fetch = FetchType.LAZY)
private Set<String> serp_filters;

private boolean is_closed;

@Embedded
private MetapolicyStruct metapolicy_struct;

private String metapolicy_extra_info;

@Embedded
private Facts facts;

@ElementCollection(fetch = FetchType.LAZY)
private Set<String> payment_methods;

private String hotel_chain;

@OneToMany(mappedBy = "data", fetch = FetchType.LAZY)
private Set<AmenityGroup> amenity_groups;

@OneToMany(mappedBy = "data", fetch = FetchType.LAZY)
private Set<RoomGroup> room_groups;

@OneToMany(mappedBy = "data", fetch = FetchType.LAZY)
private Set<PolicyStruct> policy_struct;

My function :

@Autowired
private DataRepository dataRepository; 

public regionSearchData returnFullData(regionSearchResponseRH response) {
    
    regionSearchData data = new regionSearchData();

    
    response.getData().getHotels().forEach(H -> {

        Data hotelD = dataRepository.findById(H.getId());
        if (hotelD != null) {
            RatesSearch R = H.getRates().get(0);
            R.setAddress(hotelD.getAddress());
            R.setImages(hotelD.getImages());
            R.setStar_certificate(hotelD.getStar_certificate());
            R.setStar_rating(hotelD.getStar_rating());
            R.setName(hotelD.getName());
                
        }
        
        });
    
    data.setTotal_hotels(response.getData().getTotal_hotels());
    data.setHotels(response.getData().getHotels()); 

    return data;
    
    
    
}

Hibernate statistics :

16893585 nanoseconds spent preparing 42 JDBC statements;
347535215661 nanoseconds spent executing 42 JDBC statements;
656641754926 nanoseconds spent executing 82 JDBC statements;

If I remove the findBy it take 10 seconds...

thanks, Idan


Solution

  • Definitely there are more than one things you can try. Your code takes more than ten minutes may be because of many roundtrips it's doing to DB over network. Remember it's not only DB search that takes more time, it's the round trips from your application to DB over network that takes more time, i suspect that is the case here.

    The first thing you can do depending on what is your usecase is minimizing round trips to DB and calling findById () in a for loop is the worst thing to do. Below are my advice

    1. Adding index to the field by which you are searching
    2. Avoid calling findById () in a for loop, you can use some other repository methods or you can write findByIdsIn(list of ids), depending on the size of this for loop - either you can pass all ids at once or use batched approach like getting 20-30 entities at a time and doing your business logic.
    3. You have many tables joined to your entity using @oneToMany or @ElementCollection, you need to check if hibernate is executing N+1 Queries for that, if so, solve that first.