
[Jacoco] Aggregating Jacoco Reports for Multi-Module Projects

· 2 min read

Overview​

Gradle 7.4 added a feature that lets you aggregate multiple Jacoco test reports into a single, unified report. Previously, it was difficult to view test results across multiple modules in one file; now merging these reports is much more convenient.

Usage​

Creating a Submodule Solely for Collecting Reports​

The current project structure consists of a module named application and other modules like list and utils that are used by the application module.

By adding a code-coverage-report module, we can collect the test reports from the application, list, and utils modules.

The project structure will then look like this:

  • application
  • utils
  • list
  • code-coverage-report
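The root settings.gradle then includes all four modules. A minimal sketch (the root project name is illustrative):

// settings.gradle
rootProject.name = 'sample-project'

include 'application'
include 'utils'
include 'list'
include 'code-coverage-report'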

Adding the jacoco-report-aggregation Plugin​

// code-coverage-report/build.gradle
plugins {
    id 'base'
    id 'jacoco-report-aggregation'
}

repositories {
    mavenCentral()
}

dependencies {
    jacocoAggregation project(":application")
}

Now, by running ./gradlew testCodeCoverageReport, you can generate a Jacoco report that aggregates the test results from all modules.


warning

To use the aggregation feature, a jar file is required. If you have set jar { enabled = false }, you need to change it to true.
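Re-enabling the jar task in such a module might look like this (a minimal sketch of that module's build.gradle):

jar {
    enabled = true
}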

Update 22-09-28​

In the case of a Gradle multi-project setup, there is an issue where packages that were properly excluded in a single project are not excluded in the aggregate report.

By adding the following configuration, you can generate a report that excludes specific packages.

testCodeCoverageReport {
    reports {
        csv.required = true
        xml.required = false
    }
    getClassDirectories().setFrom(files(
        [project(':api'), project(':utils'), project(':core')].collect {
            it.fileTree(dir: "${it.buildDir}/classes/java/main", exclude: [
                '**/dto/**',
                '**/config/**',
                '**/output/**',
            ])
        }
    ))
}

Next Step​

The jvm-test-suite plugin, which Gradle introduced alongside jacoco-report-aggregation, also looks very useful. Since the two plugins are complementary, it would be beneficial to use them together.


Why Docker?

· 5 min read
info

This article is written for internal information sharing and is explained based on a Java development environment.

What is Docker?​

info

A containerization technology for creating and using Linux containers; it is also the name of the largest company supporting this technology, as well as the name of the open-source project.

The image everyone has seen at least once when searching for Docker

Introduced in 2013, Docker has transformed the infrastructure world into a container-centric one. Many applications are now deployed as containers, and writing a Dockerfile to build an image and deploy a container has become a common development process. In the 2019 DockerCon presentation, it was reported that there had been a staggering 105.2 billion container image pulls.

Using Docker allows you to handle containers like very lightweight modular virtual machines. Additionally, containers can be built, deployed, copied, and moved from one environment to another flexibly, supporting the optimization of applications for the cloud.

Benefits of Docker Containers​

Consistent Behavior Everywhere​

As long as a container runtime is installed, Docker containers guarantee the same behavior everywhere. For example, team member A on Windows and team member B on macOS work on different operating systems, but by sharing an image through a Dockerfile, they see the same results regardless of OS. The same goes for deployment: once a container has been verified to work correctly, it will run normally wherever it is deployed, without additional configuration.

Modularity​

Docker's containerization approach focuses on the ability to decompose, update, or recover parts of an application without needing to break down the entire application. Users can share processes among multiple applications in a microservices-based approach, similar to how service-oriented architecture (SOA) operates.

Layering and Image Version Control​

Each Docker image file consists of a series of layers, which are combined into a single image.

Docker reuses these layers when building new containers, making the build process much faster. Intermediate changes are shared between images, improving speed, scalability, and efficiency.
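The layers of an image can be inspected directly; for instance, docker history lists them, newest first (the image name here is just an illustration):

# Show the layers that make up the image
docker history songkg7/rest-server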

Rapid Deployment​

Docker-based containers can reduce deployment time to mere seconds. Since there is no need to boot an OS to add or move a container, deployment time drops significantly. Moreover, the fast deployment speed makes it cheap and easy to create and delete the data generated by containers, without worrying about whether it was done correctly.

In short, Docker technology emphasizes efficiency and offers a more granular and controllable microservices-based approach.

Rollback​

When deploying with Docker, images are used with tags. For example, if you deploy using version 1.2 of an image, and version 1.1 of the image is still in the repository, you can simply run the command without needing to prepare the jar file again.

docker run --name app image:1.2
docker stop app
# Remove the stopped container so the name can be reused
docker rm app

# Run version 1.1
docker run --name app image:1.1

Comparing Before and After Using Docker​

Using Docker containers allows for much faster and more flexible deployment compared to traditional methods.

Deployment Without Docker Containers​

  1. Package the jar file to be deployed on the local machine.
  2. Transfer the jar file to the production server using file transfer protocols like scp.
  3. Write a service file using systemctl for status management (a sketch of such a file follows this list).
  4. Run the application with systemctl start app.
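A minimal sketch of the service file from step 3 (paths and names are illustrative):

# /etc/systemd/system/app.service
[Unit]
Description=app

[Service]
ExecStart=/usr/bin/java -jar /opt/app/app.jar
Restart=on-failure

[Install]
WantedBy=multi-user.target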

If multiple apps are running on a single server, complexity increases significantly, for example when tracking down which app has stopped. Running multiple apps on multiple servers is just as cumbersome, since the commands must be executed on each server; it is a tiring process.

Deployment With Docker Containers​

  1. Use a Dockerfile to create an image of the application. → Build ⚒️
  2. Push the image to a repository like Docker Hub or the GitLab registry. → Shipping 🚢
  3. Run the application on the production server with docker run image. (The commands below sketch all three steps.)
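In command form, the three steps might look like this (the registry and image names are placeholders):

# 1. Build
docker build -t registry.example.com/team/app:1.0 .

# 2. Shipping
docker push registry.example.com/team/app:1.0

# 3. Run on the production server
docker run -d --name app registry.example.com/team/app:1.0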

You don't need to waste time on complex path settings and file transfer processes. Docker works in any environment, ensuring it runs anywhere and uses resources efficiently.

Docker is designed to manage single containers effectively. However, as you start using hundreds of containers and containerized apps, management and orchestration can become very challenging. To provide services like networking, security, and telemetry across all containers, you need to step back and group them. This is where Kubernetes1 comes into play.

When Should You Use It?​

Developers can find Docker extremely useful in almost any situation. In fact, Docker often proves superior to traditional methods in development, deployment, and operations, so Docker containers should always be a top consideration.

  1. When you need a development database like PostgreSQL on your local machine (see the example after this list).
  2. When you want to test or quickly adopt new technologies.
  3. When you have software that is difficult to install or uninstall directly on your local machine (e.g., reinstalling Java on Windows can be a nightmare).
  4. When you want to run the latest deployment version from another team, like the front-end team, on your local machine.
  5. When you need to switch your production server from NCP to AWS.
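For the first case, a disposable development database is one command away. A sketch, where the password and image tag are placeholders:

docker run -d --name dev-postgres \
  -e POSTGRES_PASSWORD=password \
  -p 5432:5432 \
  postgres:15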

Example​

A simple API server:

docker run --name rest-server -p 80:8080 songkg7/rest-server
# Using curl
curl http://localhost/ping

# Using httpie
http localhost/ping

Since port 80 is mapped to the container's port 8080, you can see that communication with the container works well.

Commonly Used Docker Run Options

  • --name : Assign a name to the container
  • -p : Publish a container's port(s) to the host
  • --rm : Automatically remove the container when it exits
  • -i : Interactive; keep STDIN open even if not attached
  • -t : Allocate a pseudo-TTY, creating a terminal-like environment
  • -v : Bind mount a volume

Conclusion​

Using Docker containers allows for convenient operations while solving issues that arise with traditional deployment methods. Next, we'll look into the Dockerfile, which creates an image of your application.



Footnotes​

  1. Kubernetes ↩

Exploring Kubernetes

· 4 min read

What is Kubernetes?​

Kubernetes provides the following functionalities:

  • Service discovery and load balancing
  • Storage orchestration
  • Automated rollouts and rollbacks
  • Automated bin packing
  • Automated scaling
  • Secret and configuration management

For more detailed information, refer to the official documentation.

There are various ways to run Kubernetes, but the official site uses minikube for demonstration. This article focuses on utilizing Kubernetes using Docker Desktop. If you want to learn how to use minikube, refer to the official site.

Let's briefly touch on minikube.

Minikube​

Install​

brew install minikube

Usage​

The commands are intuitive and straightforward, requiring minimal explanation.

minikube start
minikube dashboard
minikube stop
# Clean up resources after use
minikube delete --all

Pros​

Minikube is suitable for development purposes as it does not require detailed configurations like setting up secrets.

Cons​

One major drawback is that the dashboard command sometimes hangs. This issue is the main reason I am not using minikube while writing this article.

Docker Desktop​

Install​

Simply activate Kubernetes from the Docker Desktop menu.


Dashboard​

The Kubernetes dashboard is not enabled by default. You can activate it using the following command:

kubectl apply -f https://raw.githubusercontent.com/kubernetes/dashboard/v2.5.0/aio/deploy/recommended.yaml

Starting the Dashboard​

kubectl proxy

You can now access the dashboard at http://localhost:8001/api/v1/namespaces/kubernetes-dashboard/services/https:kubernetes-dashboard:/proxy/.


To log in, you will need a token. Let's see how to create one.

Secrets​

First, create a kubernetes folder to store related files separately.

mkdir kubernetes && cd kubernetes

warning

Granting admin privileges to the dashboard account can pose security risks, so be cautious when using it in actual operations.

dashboard-adminuser.yaml​

apiVersion: v1
kind: ServiceAccount
metadata:
  name: admin-user
  namespace: kubernetes-dashboard

kubectl apply -f dashboard-adminuser.yaml

cluster-role-binding.yaml

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: admin-user
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
  - kind: ServiceAccount
    name: admin-user
    namespace: kubernetes-dashboard

kubectl apply -f cluster-role-binding.yaml

Create Token​

kubectl -n kubernetes-dashboard create token admin-user
eyJhbGciOiJSUzI1NiIsImtpZCI6IjVjQjhWQVdpeWdLTlJYeXVKSUpxZndQUkoxdzU3eXFvM2dtMHJQZGY4TUkifQ.eyJhdWQiOlsiaHR0cHM6Ly9rdWJlcm5ldGVzLmRlZmF1bHQuc3ZjLmNsdXN0ZXIubG9jYWwiXSwiZXhwIjox7jU4NTA3NTY1LCJpYXQiOjE2NTg1MDM5NjUsImlzcyI6Imh0dHBzOi8va3ViZXJuZXRlcy5kZWZhdWx0LnN2Yy5jbHVzdGVyLmxvY2FsIiwia3ViZXJuZXRlcy5pbyI6eyJuYW4lc3BhY2UiOiJrdWJlcm5ldGVzLWRhc2hib2FyZCIsInNlcnZpY2VhY2NvdW55Ijp7Im5hbWUiOiJhZG1pbi11c2VyIiwidWlkIjoiZTRkODM5NjQtZWE2MC00ZWI0LTk1NDgtZjFjNWQ3YWM4ZGQ3In19LCJuYmYiOjE2NTg1MDM5NjUsInN1YiI6InN5c3RlbTpzZXJ2aWNlYWNjb3VudDprdWJlcm5ldGVzLWRhc2hib2FyZDphZG1pbi11c2VyIn1.RjoUaQnhTVKvzpAx_rToItI8HTZsr-6brMHWL63ca1_D4QIMCxU-zz7HFK04tCvOwyOTWw603XPDCv-ovjs1lM6A3tdgncqs8z1oTRamM4E-Sum8oi7cKnmVFSLjfLKqQxapBvZF5x-SxJ8Myla-izQxYkCtbWIlc6JfShxCSBJvfwSGW8c6kKdYdJv1QQdU1BfPY1sVz__cLNPA70_OpoosHevfVV86hsMvxCwVkNQHIpGlBX-NPog4nLY4gfuCMxKqjdVh8wLT7yS-E3sUJiXCcPJ2-BFSen4y-RIDbg18qbCtE3hQBr033Mfuly1Wc12UkU4bQeiF5SerODDn-g

Use the generated token to log in.

Successful access!

Creating a Deployment​

Create a deployment using an image. For this article, a web server written in Go has been prepared in advance.

kubectl create deployment rest-server --image=songkg7/rest-server
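Incidentally, if you prefer to keep the deployment as a manifest file, the same command can emit YAML via a client-side dry run:

kubectl create deployment rest-server --image=songkg7/rest-server \
  --dry-run=client -o yaml > deployment.yaml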

As soon as the command is executed successfully, you can easily monitor the changes on the dashboard.

The dashboard updates immediately upon deployment creation.

However, let's also learn how to check this via the CLI (back to basics!).

Checking Status​

kubectl get deployments


When a deployment is created, pods are also generated simultaneously.

kubectl get pods -o wide


Having confirmed that everything is running smoothly, let's send a request to our web server. Instead of using curl, we will use httpie1. If you are more comfortable with curl, feel free to use it.

http localhost:8080/ping

error

Even though everything seems to be working fine, why can't we receive a response? πŸ€”

This is because our service is not exposed to the outside world yet. By default, Kubernetes pods can only communicate internally. Let's make our service accessible externally.

Exposing the Service​

kubectl expose deployment rest-server --type=LoadBalancer --port=8080

Since our service uses port 8080, we open this port. Using a different port may result in connection issues.
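You can confirm how the service was exposed; the TYPE and PORT(S) columns show where it is reachable:

kubectl get services rest-server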

Now, try sending the request again.

http localhost:8080/ping

200

You can see that you receive a successful response.



Footnotes​

  1. Elegant httpie ↩

[Java] Making First Collection More Collection-like - Iterable

· 2 min read

Overview​

// Java Collection that implements Iterable.
public interface Collection<E> extends Iterable<E>

First-class collections are a very useful way to handle objects. However, despite the name, a first-class collection only holds a Collection as a field; it is not actually a Collection itself, so you cannot use the various methods Collection provides. This article introduces a way to make a first-class collection feel more like a real Collection using Iterable.

Let's look at a simple example.

Example​

@Value
public class LottoNumber {
    int value;

    public static LottoNumber create(int value) {
        return new LottoNumber(value);
    }
}

public class LottoNumbers {

    private final List<LottoNumber> lottoNumbers;

    private LottoNumbers(List<LottoNumber> lottoNumbers) {
        this.lottoNumbers = lottoNumbers;
    }

    public static LottoNumbers create(LottoNumber... numbers) {
        return new LottoNumbers(List.of(numbers));
    }

    // Delegates to the List's isEmpty() method.
    public boolean isEmpty() {
        return lottoNumbers.isEmpty();
    }
}

LottoNumbers is a first-class collection that holds LottoNumber as a list. To check if the list is empty, we have implemented isEmpty().

Let's write a simple test for isEmpty().

@Test
void isEmpty() {
    LottoNumber lottoNumber = LottoNumber.create(7);
    LottoNumbers lottoNumbers = LottoNumbers.create(lottoNumber);

    assertThat(lottoNumbers.isEmpty()).isFalse();
}

It's not bad, but AssertJ provides various methods to test collections.

  • has..
  • contains...
  • isEmpty()

You cannot use these convenient assertion methods with a first-class collection, because it is not a Collection and therefore has no access to them.

More precisely, you cannot use them because the elements cannot be iterated without iterator(). To get iterator(), you just need to implement Iterable.

The implementation is very simple.

public class LottoNumbers implements Iterable<LottoNumber> {

    //...

    @Override
    public Iterator<LottoNumber> iterator() {
        return lottoNumbers.iterator();
    }
}

Since a first-class collection already holds a Collection, you can simply return that collection's iterator, just as you delegated isEmpty().

@Test
void isEmpty_iterable() {
    LottoNumber lottoNumber = LottoNumber.create(7);
    LottoNumbers lottoNumbers = LottoNumbers.create(lottoNumber);

    assertThat(lottoNumbers).containsExactly(lottoNumber);
    assertThat(lottoNumbers).isNotEmpty();
    assertThat(lottoNumbers).hasSize(1);
}

Now you can use various test methods.

Not only in tests but also in functionality implementation, you can conveniently use it.

for (LottoNumber lottoNumber : lottoNumbers) {
    System.out.println("lottoNumber: " + lottoNumber);
}

This is possible because the enhanced for loop uses iterator().
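Since Java 8, Iterable also provides a default forEach method, so the same loop can be written more concisely:

lottoNumbers.forEach(lottoNumber -> System.out.println("lottoNumber: " + lottoNumber));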

Conclusion​

By implementing Iterable, you can use much richer functionality. The implementation is not difficult, and it is close to extending functionality, so if you have a first-class collection, actively utilize Iterable.

Elegant HTTP CLI, HTTPie

· 2 min read

Overview​

CLI tool that can replace the curl command

If you are a developer who frequently uses Linux, you probably use the curl command often. It is an essential command for sending external API requests from a server, but it has the disadvantage of poor readability in the output. HTTPie is an interesting tool that can alleviate this drawback, so let's introduce it.

Install​

For Mac users, you can easily install it using brew.

brew install httpie

For CentOS, you can install it using yum.

yum install epel-release
yum install httpie

Usage​

First, here is how you would send a GET request using curl.

curl https://httpie.io/hello


Now, let's compare it with HTTPie.

https httpie.io/hello


The readability is much better in every aspect of the command. The response and header values are included by default, so you can get various information at a glance without using separate commands.

Note that the https and http commands are distinct; for a plain-HTTP endpoint, use http.

http localhost:8080

You can send a POST request as described on the official website.

http -a USERNAME POST https://api.github.com/repos/httpie/httpie/issues/83/comments body='HTTPie is awesome! :heart:'
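HTTPie also has a concise shorthand for JSON request bodies; in this sketch against a hypothetical local endpoint, = sends a JSON string and := sends a raw JSON value:

# name is sent as the string "John", age as the number 20
http POST localhost:8080/users name=John age:=20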

Various other features are explained on GitHub, so if you make good use of them, you can greatly improve productivity.


The Truth and Misconceptions about Getters and Setters

· 3 min read

When you search for getter/setter on Google, you'll find a plethora of articles. Most of them explain the reasons for using getter/setter, often focusing on keywords like encapsulation and information hiding.

The common explanation is that by declaring field variables as private to prevent external access and only exposing them through getter/setter, encapsulation is achieved.

However, does using getter/setter truly encapsulate data?

In reality, getter/setter cannot achieve encapsulation at all. To achieve encapsulation, one should avoid using getters and setters. To understand this, it is necessary to have a clear understanding of encapsulation.

What is Encapsulation?​

Encapsulation in object-oriented programming has two aspects: bundling an object's attributes (data fields) and behaviors (methods) together and hiding some of the object's implementation details internally. - Wikipedia

Encapsulation means that the external entities should not have complete knowledge of an object's internal attributes.

Why Getters and Setters Fail to Encapsulate​

As we have learned, encapsulation dictates that external entities should not know the internal attributes of an object. However, getter/setter blatantly exposes the fact that a specific field exists to the outside world. Let's look at an example.

Example​

public class Student {

    private String name;
    private int age;

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }

    public int getAge() {
        return age;
    }

    public void setAge(int age) {
        this.age = age;
    }

    public String introduce() {
        return String.format("My name is %s and I am %d years old.", name, age);
    }
}

class StudentTest {

    @Test
    void student() {
        Student student = new Student();
        student.setName("John");
        student.setAge(20);
        String introduce = student.introduce();

        assertThat(student.getName()).isEqualTo("John");
        assertThat(student.getAge()).isEqualTo(20);
        assertThat(introduce).isEqualTo("My name is John and I am 20 years old.");
    }
}

From outside the Student class, it is evident that it has attributes named name and age. Can we consider this state as encapsulated?

If the age attribute were to be removed from Student, changes would need to be made everywhere getter/setter is used. This creates strong coupling.

True encapsulation means that modifications to an object's internal structure should not affect the external entities, except for the public interface.

Let's try to hide the internal implementation.

public class Student {

    private String name;
    private int age;

    public Student(String name, int age) {
        this.name = name;
        this.age = age;
    }

    public String introduce() {
        return String.format("My name is %s and I am %d years old.", name, age);
    }
}

class StudentTest {

    @Test
    void student() {
        Student student = new Student("John", 20);
        String introduce = student.introduce();

        assertThat(introduce).isEqualTo("My name is John and I am 20 years old.");
    }
}

Now the object no longer exposes its internal implementation through its public interface. External code cannot know what data it holds, cannot modify that data, and can only communicate with the object through messages.
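The payoff is that the internal representation can now change freely. A sketch where the stored field becomes a birth year while the public interface stays identical (the field choice is illustrative):

import java.time.Year;

public class Student {

    private final String name;
    private final int birthYear; // internal representation changed; callers are unaffected

    public Student(String name, int age) {
        this.name = name;
        this.birthYear = Year.now().getValue() - age;
    }

    public String introduce() {
        int age = Year.now().getValue() - birthYear;
        return String.format("My name is %s and I am %d years old.", name, age);
    }
}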

Conclusion​

Encapsulation is a crucial topic in object-oriented design, emphasizing designs that are not dependent on external factors. Opinions vary on the level of encapsulation, with some advocating against using both getter and setter and others suggesting that using getter is acceptable.

Personally, I believe in avoiding getter usage as much as possible, but there are situations, especially in testing, where having getters or setters can make writing test code easier. Deciding on the level of encapsulation depends on the current situation and the purpose of the code being developed.

Good design always emerges through the process of trade-offs.

info

All example code can be found on GitHub.

Resolving 'Cannot Find sitemap.xml' Issue

· One min read

I had registered the sitemap.xml for my blog to ensure it gets indexed by Google, but all I was getting was an error message saying 'sitemap not found'. Finally, I managed to resolve it, and I am sharing the method I used.

While this method may not solve every case, it seems worth a try.

Simply run the following command:

curl https://www.google.com/ping\?sitemap\={path to your submitted sitemap}

And then, when you check the search console again...!

Finally resolved after almost a month...😒

The sitemap is finally being recognized.

Hope this helps!


[Spring Batch] KafkaItemReader

· 2 min read
info

I used Docker to install Kafka before writing this post, but that content is not covered here.

What is KafkaItemReader..?​

In Spring Batch, the KafkaItemReader is provided for processing data from Kafka topics.

Let's create a simple batch job.

Example​

First, add the necessary dependencies.

dependencies {
    ...
    implementation 'org.springframework.boot:spring-boot-starter-batch'
    implementation 'org.springframework.kafka:spring-kafka'
    ...
}

Configure Kafka settings in application.yml.

spring:
  kafka:
    bootstrap-servers:
      - localhost:9092
    consumer:
      group-id: batch

@Slf4j
@Configuration
@RequiredArgsConstructor
public class KafkaSubscribeJobConfig {

    private final JobBuilderFactory jobBuilderFactory;
    private final StepBuilderFactory stepBuilderFactory;
    private final KafkaProperties kafkaProperties;

    @Bean
    Job kafkaJob() {
        return jobBuilderFactory.get("kafkaJob")
                .incrementer(new RunIdIncrementer())
                .start(step1())
                .build();
    }

    @Bean
    Step step1() {
        return stepBuilderFactory.get("step1")
                .<String, String>chunk(5)
                .reader(kafkaItemReader())
                .writer(items -> log.info("items: {}", items))
                .build();
    }

    @Bean
    KafkaItemReader<String, String> kafkaItemReader() {
        Properties properties = new Properties();
        properties.putAll(kafkaProperties.buildConsumerProperties());

        return new KafkaItemReaderBuilder<String, String>()
                .name("kafkaItemReader")
                .topic("test") // 1.
                .partitions(0) // 2.
                .partitionOffsets(new HashMap<>()) // 3.
                .consumerProperties(properties) // 4.
                .build();
    }
}
  1. Specify the topic from which to read the data.
  2. Specify the partition of the topic; multiple partitions can be specified.
  3. If no offset is specified in KafkaItemReader, it reads from offset 0. Providing an empty map reads from the last offset.
  4. Set the essential properties for execution.

tip

KafkaProperties provides various public interfaces to conveniently use Kafka in Spring.

Try it out​

Now, when you run the batch job, consumer groups are automatically created based on the information in application.yml, and the job starts subscribing to the topic.

Let's use the Kafka console producer to add the numbers 1 through 10 to the test topic.

kafka-console-producer.sh --bootstrap-server localhost:9092 --topic test
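If you'd rather not type the numbers interactively, the same data can be piped in (a small sketch using seq):

seq 1 10 | kafka-console-producer.sh --bootstrap-server localhost:9092 --topic test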


You can see that the batch job is successfully subscribing to the topic.


Since we set the chunkSize to 5, the data is processed in batches of 5.

So far, we have looked at the basic usage of KafkaItemReader in Spring Batch. Next, let's see how to write test code.

Easily Perform Static Code Analysis with Qodana

· 2 min read

What is Qodana?​

Qodana is a code quality improvement tool provided by JetBrains. It is very easy to use, so I would like to introduce it briefly.

First, you need an environment with Docker installed.

docker run --rm -it -p 8080:8080 \
  -v <source-directory>/:/data/project/ \
  -v <output-directory>/:/data/results/ \
  jetbrains/qodana-jvm --show-report

I am analyzing a Java application, so I used the jvm image. If you are using a different language, you can find the appropriate image on Qodana's website.

  • Replace <source-directory> with the path to the project you want to analyze.
  • Enter the path where the analysis results will be stored in <output-directory>. I will explain this further below.

To store the analysis results, I created a folder named qodana in the root directory.

mkdir ~/qodana
# Then replace <output-directory> with ~/qodana.
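Putting it together, assuming the project is the current directory and the results go to ~/qodana, the invocation might look like this:

docker run --rm -it -p 8080:8080 \
  -v $(pwd)/:/data/project/ \
  -v ~/qodana/:/data/results/ \
  jetbrains/qodana-jvm --show-report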

Now, run the docker run command above and wait a moment, and you will see the results shown below.

I used a simple Java application for testing.


Now, if you access http://localhost:8080, you can see the code analysis results.


If you have Docker installed, you can easily obtain the code analysis results of your current project.

Such analysis tools serve as a form of code review, reducing the reviewer's fatigue and allowing them to focus on more detailed reviews. Actively utilizing code quality management tools like this can lead to a very convenient development experience.

[Spring Batch] Implementing Custom Constraint Writer

· 4 min read

Situation 🧐

Recently, I designed a batch process that uses Upsert in PostgreSQL for specific logic. During implementation, a change in business requirements meant I had to add one more column to a composite unique constraint.

The problem was that a unique constraint over the composite columns does not prevent duplicates when a specific column is null.

Let's take a look at an example of the problematic situation.

create table student
(
    id integer not null
        constraint student_pk
            primary key,
    name varchar,
    major varchar,
    constraint student_unique
        unique (name, major)
);

| id | name | major   |
|----|------|---------|
| 1  | song | korean  |
| 2  | kim  | english |
| 3  | park | math    |
| 4  | kim  | NULL    |
| 5  | kim  | NULL    |

To disallow such null duplicates, inserting dummy data naturally came to mind, but I was reluctant to store meaningless values in the database. Especially if the nullable column stores complex data like a UUID, meaningless values would be very hard to spot among the real ones.

Although it is a bit more cumbersome, a unique partial index lets us reject these duplicates without inserting any dummy data. I decided to pursue the most ideal solution, even if it is challenging.

Solution​

Partial Index​

CREATE UNIQUE INDEX stu_2col_uni_idx ON student (name, major)
WHERE major IS NOT NULL;

CREATE UNIQUE INDEX stu_1col_uni_idx ON student (name)
WHERE major IS NULL;

PostgreSQL provides the functionality of partial indexes.

Partial Index : A feature that creates an index only when certain conditions are met. It allows for efficient index creation and maintenance by narrowing the scope of the index.

When a row is inserted with only name, stu_1col_uni_idx allows just one row with that name whose major is null. By creating two complementary indexes, we can neatly prevent duplicates involving null values in a specific column.
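A quick sketch against the table above shows the effect (the 'lee' rows are hypothetical):

-- Succeeds: first row for 'lee' with a null major
insert into student (id, name, major) values (6, 'lee', null);

-- Fails: stu_1col_uni_idx already contains 'lee' with a null major
insert into student (id, name, major) values (7, 'lee', null);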

An error occurs when trying to store a value without major

However, with two unique indexes like this, the batch did not run as intended, because PostgreSQL's Upsert allows only a single conflict target (constraint check) per statement.

After much deliberation, I decided to check if a specific value is missing before executing the SQL and then execute the SQL that meets the conditions.

Implementing SelectConstraintWriter​

import static java.util.stream.Collectors.toList;

import java.util.List;
import java.util.Map;

import org.springframework.batch.item.database.JdbcBatchItemWriter;
import org.springframework.dao.EmptyResultDataAccessException;
import org.springframework.jdbc.core.PreparedStatementCallback;
import org.springframework.jdbc.core.namedparam.SqlParameterSource;

import lombok.Setter;

public class SelectConstraintWriter extends JdbcBatchItemWriter<Student> {

    @Setter
    private String anotherSql;

    @Override
    public void write(List<? extends Student> items) {
        if (items.isEmpty()) {
            return;
        }

        List<? extends Student> existMajorStudents = items.stream()
                .filter(student -> student.getMajor() != null)
                .collect(toList());

        List<? extends Student> nullMajorStudents = items.stream()
                .filter(student -> student.getMajor() == null)
                .collect(toList());

        executeSql(existMajorStudents, sql);
        executeSql(nullMajorStudents, anotherSql);
    }

    private void executeSql(List<? extends Student> students, String sql) {
        if (logger.isDebugEnabled()) {
            logger.debug("Executing batch with " + students.size() + " items.");
        }

        int[] updateCounts;

        if (usingNamedParameters) {
            if (this.itemSqlParameterSourceProvider == null) {
                // This branch mirrors JdbcBatchItemWriter and assumes the items are Maps.
                updateCounts = namedParameterJdbcTemplate.batchUpdate(sql, students.toArray(new Map[students.size()]));
            } else {
                SqlParameterSource[] batchArgs = new SqlParameterSource[students.size()];
                int i = 0;
                for (Student item : students) {
                    batchArgs[i++] = itemSqlParameterSourceProvider.createSqlParameterSource(item);
                }
                updateCounts = namedParameterJdbcTemplate.batchUpdate(sql, batchArgs);
            }
        } else {
            updateCounts = namedParameterJdbcTemplate.getJdbcOperations().execute(sql,
                    (PreparedStatementCallback<int[]>) ps -> {
                        for (Student item : students) {
                            itemPreparedStatementSetter.setValues(item, ps);
                            ps.addBatch();
                        }
                        return ps.executeBatch();
                    });
        }

        if (assertUpdates) {
            for (int i = 0; i < updateCounts.length; i++) {
                int value = updateCounts[i];
                if (value == 0) {
                    throw new EmptyResultDataAccessException("Item " + i + " of " + updateCounts.length
                            + " did not update any rows: [" + students.get(i) + "]", 1);
                }
            }
        }
    }
}

I implemented this by overriding the write method of the JdbcBatchItemWriter we were already using. Checking for the presence of major in code and choosing the matching SQL ensures the Upsert statement works correctly instead of throwing a DuplicateKeyException.

Here is an example of usage:

@Bean
SelectConstraintWriter studentItemWriter() {
    String sql1 =
            "INSERT INTO student(id, name, major) "
                    + "VALUES (nextval('hibernate_sequence'), :name, :major) "
                    + "ON CONFLICT (name, major) WHERE major IS NOT NULL "
                    + "DO UPDATE "
                    + "SET name = :name, "
                    + "    major = :major";

    String sql2 =
            "INSERT INTO student(id, name, major) "
                    + "VALUES (nextval('hibernate_sequence'), :name, :major) "
                    + "ON CONFLICT (name) WHERE major IS NULL "
                    + "DO UPDATE "
                    + "SET name = :name, "
                    + "    major = :major";

    SelectConstraintWriter writer = new SelectConstraintWriter();
    writer.setSql(sql1);
    writer.setAnotherSql(sql2);
    writer.setDataSource(dataSource);
    writer.setItemSqlParameterSourceProvider(new BeanPropertyItemSqlParameterSourceProvider<>());
    writer.afterPropertiesSet();
    return writer;
}

Conclusion​

It's regrettable that if PostgreSQL allowed multiple constraint checks during Upsert execution, we wouldn't have needed to go to such lengths. I hope for updates in future versions.


Reference​

create unique constraint with null columns