I have the following structure of my test application:
Entity:
@Data
@Entity
@Table(name = "content")
public class Content {
@Id
@GeneratedValue(strategy = GenerationType.IDENTITY)
private int id;
private String text;
}
Repository:
@Repository
public class ContentRepositoryCustomImpl implements ContentRepositoryCustom {
@PersistenceContext
private EntityManager entityManager;
@Override
public void updateLastChar(int batchSize, String lastChar) {
TypedQuery<Long> countQuery = entityManager.createQuery("Select count(c.id) from Content c", Long.class);
long count = countQuery.getSingleResult();
TypedQuery<Content> query = entityManager.createQuery("Select c From Content c", Content.class);
int pageNumber = 1;
while (pageNumber < count) {
query.setFirstResult(pageNumber - 1);
query.setMaxResults(batchSize);
List<Content> contents = query.getResultList();
for (Content content : contents) {
String temp = content.getText().substring(1);
content.setText(temp + lastChar);
}
pageNumber += batchSize;
}
}
}
Test:
@ExtendWith(SpringExtension.class)
@SpringBootTest
class ContentRepositoryTest {
@Autowired
ContentRepository contentRepository;
@Test
@Transactional
@Commit
public void testOOMError2() throws InterruptedException {
contentRepository.updateLastChar(30, "5");
}
}
application.yml:
spring:
datasource:
username: user
password: pass
url: jdbc:postgresql://localhost:5432/pg?currentSchema=myschema&characterEncoding=UTF-8&serverTimezone=UTC
driverClassName: org.postgresql.Driver
type: com.zaxxer.hikari.HikariDataSource
hikari:
maximumPoolSize: 20
jpa:
properties:
hibernate:
hbm2ddl:
auto: none
dialect: org.hibernate.dialect.PostgreSQL9Dialect
generate_statistics: true
hibernate:
naming:
physical-strategy: org.hibernate.boot.model.naming.PhysicalNamingStrategyStandardImpl
DB:
CREATE TABLE etl.content
(
id integer NOT NULL DEFAULT nextval('etl.content_seq'::regclass),
text character varying(1048512) COLLATE pg_catalog."default",
CONSTRAINT content_pkey PRIMARY KEY (id)
)
In the DB I have 100 records. The 'text' column is full (1048512 chars) for every row.
The statistics which I get after the test can change from time to time.
When all records were updated:
Session Metrics {
1365887 nanoseconds spent acquiring 1 JDBC connections;
0 nanoseconds spent releasing 0 JDBC connections;
9047519 nanoseconds spent preparing 104 JDBC statements;
10297157763 nanoseconds spent executing 104 JDBC statements;
0 nanoseconds spent executing 0 JDBC batches;
0 nanoseconds spent performing 0 L2C puts;
0 nanoseconds spent performing 0 L2C hits;
0 nanoseconds spent performing 0 L2C misses;
875610249 nanoseconds spent executing 1 flushes (flushing a total of 100 entities and 0 collections);
8352008089 nanoseconds spent executing 5 partial-flushes (flushing a total of 180 entities and 180 collections)
When some records were updated (79 in the case below):
Session Metrics {
1566421 nanoseconds spent acquiring 1 JDBC connections;
0 nanoseconds spent releasing 0 JDBC connections;
8680213 nanoseconds spent preparing 104 JDBC statements;
10392348834 nanoseconds spent executing 104 JDBC statements;
0 nanoseconds spent executing 0 JDBC batches;
0 nanoseconds spent performing 0 L2C puts;
0 nanoseconds spent performing 0 L2C hits;
0 nanoseconds spent performing 0 L2C misses;
735904121 nanoseconds spent executing 1 flushes (flushing a total of 79 entities and 0 collections);
8773113348 nanoseconds spent executing 5 partial-flushes (flushing a total of 158 entities and 158 collections)
I try to figure out why it is happening. When I change the flush mode to COMMIT everything works:
spring.jpa.properties.org.hibernate.flushMode=ALWAYS
However, it would be nice to understand the cause of the problem.
UPDATE 1:
I removed
entityManager.persist(content);
as Simon Martinelli suggested, but it still does not work(69 entities were updated instead of 100):
Session Metrics {
11661208 nanoseconds spent acquiring 1 JDBC connections;
0 nanoseconds spent releasing 0 JDBC connections;
18982347 nanoseconds spent preparing 105 JDBC statements;
8964494681 nanoseconds spent executing 105 JDBC statements;
0 nanoseconds spent executing 0 JDBC batches;
0 nanoseconds spent performing 0 L2C puts;
0 nanoseconds spent performing 0 L2C hits;
0 nanoseconds spent performing 0 L2C misses;
774310803 nanoseconds spent executing 1 flushes (flushing a total of 69 entities and 0 collections);
7321581493 nanoseconds spent executing 5 partial-flushes (flushing a total of 138 entities and 138 collections)
UPDATE 2:
I added entityManager.flush():
public void updateLastChar(int batchSize, String lastChar) {
TypedQuery<Long> countQuery = entityManager.createQuery("Select count(c.id) from Content c", Long.class);
long count = countQuery.getSingleResult();
TypedQuery<Content> query = entityManager.createQuery("Select c From Content c", Content.class);
int pageNumber = 1;
while (pageNumber < count) {
query.setFirstResult(pageNumber - 1);
query.setMaxResults(batchSize);
List<Content> contents = query.getResultList();
for (Content content : contents) {
content.setText(lastChar);
}
entityManager.flush();
pageNumber += batchSize;
}
}
But the problem is still here. Moreover, I have less (65) than 105 statements (this behavior also occurs from time to time):
Session Metrics {
8190152 nanoseconds spent acquiring 1 JDBC connections;
0 nanoseconds spent releasing 0 JDBC connections;
5825900 nanoseconds spent preparing 65 JDBC statements;
39315983 nanoseconds spent executing 65 JDBC statements;
0 nanoseconds spent executing 0 JDBC batches;
0 nanoseconds spent performing 0 L2C puts;
0 nanoseconds spent performing 0 L2C hits;
0 nanoseconds spent performing 0 L2C misses;
57656774 nanoseconds spent executing 5 flushes (flushing a total of 249 entities and 0 collections);
3436917 nanoseconds spent executing 5 partial-flushes (flushing a total of 129 entities and 129 collections)
So after a couple of hours of suffering, I've finally found the reason: I don't use the ORDER BY clause and that`s why getResultList() sometimes returns duplicates.
The fix:
TypedQuery<Content> query = entityManager.createQuery("Select c From Content c ORDER BY c.id asc", Content.class);