
What Does It Mean to Write Well? - Writing Pipeline

· 7 min read

I mostly use the Markdown editor Obsidian for writing and host my blog on GitHub Pages. Here is how I keep a consistent, uninterrupted writing habit across these two different platforms.

info

This post was inspired by a presentation by Sungyun from 글또(geultto).

Gathering Material

In various situations such as work, side projects, or studying, I often come across topics I don't know much about. Each time this happens, I immediately create a new note¹. In it, I jot down a brief 1–2 line summary focused on the keywords I didn't know well.

I don't try to organize things in detail from the start; I'm not familiar with the topic yet, so that would be tiring, and the newly learned information may not be immediately important. However, to avoid creating duplicate notes on the same topic later, I pay attention to the note title and tags so it is easy to search for.

The key point is that this process is ongoing. If similar notes on the topic already exist, they will be enriched. Through repetition, eventually, a good post will emerge.

The initially created notes are stored in a directory named "inbox."

Learning and Organizing

The notes pile up in the inbox... I need to clear them out, right?

Once I find material that is useful and easy to organize, I study the topic and write a rough draft. Up to this point, I write not for blog purposes but for my own learning. Since it's just a simple memo, the writing style and expression are somewhat flexible. I might add some humor to make the structure of the post more interesting...

After writing this draft, I evaluate it to determine if it's suitable for posting on the blog. If the topic has been overly covered in other communities or blogs, I tend not to post it separately for the sake of differentiation.

info

However, for content rooted in personal experience, such as write-ups of problems I solved or retrospectives, I write the post even if similar ones exist on other blogs, since my perspective and takeaways may differ.

The organized posts are moved to the backlog directory.

Selecting Blog Posts

While not as many as in the inbox, a certain number of somewhat completed posts accumulate in the backlog. Around 10 posts linger there like a buffer. Over time, if my thoughts on the content change, requiring edits, or if incorrect information is found necessitating further study, some posts are demoted back to the inbox. This serves as a minimal verification process I can personally do to prevent spreading incorrect information. The surviving posts, after enduring all the hardships, are refined from personal learning posts to posts meant for others to see.

Once a post is satisfactory, it is moved to the directory "ready" for blog publication.

Uploading

Once all preparations for uploading are complete, I use O2 to convert the notes in "ready" to Markdown format and move them to the Jekyll project folder.

info

O2 is a community plugin for Obsidian that converts notes written in Obsidian to Markdown format.

[Figure] You can see how the image links are automatically converted.

The notes in "ready" are copied to the published directory before being moved to the Jekyll project, where they are stored for backup. All Obsidian-specific syntax is converted to basic Markdown, and if there are attachments, they are copied to the Jekyll project folder along with the notes. Although the attachment file paths change, causing Markdown links that worked in Obsidian to break, there's no need to worry as all this is automated by O22. 😄 3

Now, I switch tools from Obsidian to VScode. Managing a Jekyll blog sometimes requires dealing with code. This goes beyond what a simple Markdown editor can handle, so continuing to work in Obsidian may present some challenges.

I briefly review the syntax and context, then run npm run publish to complete the blog post publication process.

info

You can learn more about publishing in this post.

Proofreading

I regularly review the posts to catch any unnoticed grammatical errors or awkward expressions and refine them gradually. This process doesn't have a set end point; just check the blog from time to time and make corrections consistently.

The blog post pipeline ends here, but I'll briefly explain how to utilize it to write better posts.

Optional. Data Analysis

Obsidian provides a graph view feature. By utilizing this feature, you can visualize how your notes are organically connected and use it for data analysis.

[Figure] In the graph, only the bright green nodes are posts published on the blog.

Most notes are still on topics I'm studying or posts that didn't make the cut for blog publication. From this graph, you can infer the following:

  • Nodes in the center with many edges but not published as blog posts likely cover very common topics that I chose not to publish. Or maybe I was just lazy...
  • Nodes scattered on the outer edges without any edges represent fragmented knowledge that I haven't delved into deeply yet. Since they aren't linked to any other topic, they need further study of related topics so they can be linked internally 😂.
  • Posts on the outermost edges that have been published represent impulsively published posts during the process of acquiring new knowledge. Since they were impulsively published, it's important to periodically review them for any errors in the content.

Based on this objective data, I strive to expand my knowledge by checking how much I know and what I don't know regularly. 🧐

Conclusion

In this post, I introduced my writing pipeline and how I use Obsidian for data analysis to accurately understand myself. I hope writing becomes a natural part of your daily life, not just a task!

  • Quickly jot down emerging ideas to enable writing without losing the context of your work, making it possible to write consistently. Choose and utilize the appropriate tools according to the situation.
  • To make writing feel less like a chore, it's more efficient to add a few minutes consistently each day rather than holding onto writing for hours from scratch.
  • Blog publication can be a tedious task, so automate it as much as possible to keep the workflow simple. Focus on writing!
  • Assess how much you know and what you don't know (self-objectification). This helps greatly in selecting blog post topics and determining the direction of your learning.
info

All my writing, including drafts that are not posted on the blog, is managed publicly on GitHub.


Footnotes

  1. I've always kept notes nearby since I studied music in the past. The best ideas seemed to come right when I was about to fall asleep, and it doesn't seem much different now: solutions to bugs always come to mind just before falling asleep...

  2. O2 Plugin Development Story

  3. Although new bug issues are added with each post... 😭 ???: That's a feature

Getting the Most Out of chezmoi

· 5 min read

Following on from the previous post, I'll share some ways to make better use of chezmoi.

info

You can check out the settings I'm currently using here.

How to Use It

You can find the usage of chezmoi commands with chezmoi help and in the official documentation. In this post, I'll explain some advanced ways to use chezmoi more conveniently.

Settings

chezmoi uses the ~/.config/chezmoi/chezmoi.toml file for settings. If you need tool-specific settings, you can define them in this file. It supports not only toml but also yaml and json, so you can write in a format you are familiar with. Since the official documentation guides with toml, I'll also explain using toml as the default.

Setting Merge Tool and Default Editor

chezmoi uses vi as the default editor. Since I mainly use nvim, I'll show you how to modify it to use nvim as the default editor.

chezmoi edit-config
[edit]
command = "nvim"

[merge]
command = "nvim"
args = ["-d", "{% raw %}{{ .Destination }}{% endraw %}", "{% raw %}{{ .Source }}{% endraw %}", "{% raw %}{{ .Target }}{% endraw %}"]

If you use VScode, you can set it up like this:

[edit]
command = "code"
args = ["--wait"]

Managing gitconfig Using Templates

Sometimes you may need separate configurations rather than uniform settings. For example, you might need different gitconfig settings for work and personal environments. In such cases where only specific data needs to be separated while the rest remains similar, chezmoi allows you to control this through a method called templates, which inject environment variables.

First, create a gitconfig file:

mkdir ~/.config/git
touch ~/.config/git/config

Register gitconfig as a template to enable the use of variables:

chezmoi add --template ~/.config/git/config

Write the parts where data substitution is needed:

chezmoi edit ~/.config/git/config
[user]
name = {{ .name }}
email = {{ .email }}

These curly braces will be filled with variables defined in the local environment. You can check the default variable list with chezmoi data.

Write the variables in chezmoi.toml:

# Write local settings instead of `chezmoi edit-config`.
vi ~/.config/chezmoi/chezmoi.toml
[data]
name = "privateUser"
email = "private@gmail.com"

After writing all this, try using chezmoi apply -vn or chezmoi init -vn to see the template variables filled with data values in the config file that is generated.
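
For the example above, the generated ~/.config/git/config should end up looking roughly like this (a sketch derived from the template and the [data] section shown here):

# ~/.config/git/config after `chezmoi apply` (expected result for the example data above)
[user]
    name = privateUser
    email = private@gmail.com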

Auto Commit and Push

Simply editing dotfiles with chezmoi edit does not automatically reflect changes to the git in the local repository.

# You have to do it manually.
chezmoi cd
git add .
git commit -m "update something"
git push

To automate this process, you need to add settings to chezmoi.toml.

# `~/.config/chezmoi/chezmoi.toml`
[git]
# autoAdd = true
autoCommit = true # add + commit
autoPush = true

However, if you automate the push as well, sensitive files could accidentally end up in the remote repository. Personally, I therefore recommend enabling the automatic options only up to the commit step.

Managing Brew Packages

If you find a useful tool at work, don't forget to install it in your personal environment too. Let's manage it with chezmoi.

chezmoi cd
vi run_once_before_install-packages-darwin.sh.tmpl

run_once_ is a script prefix recognized by chezmoi; it runs the script only if it has never been executed before. Adding the before_ prefix runs the script before the dotfiles are created. A script named with these prefixes is executed in two cases:

  • When it has never been executed before (initial setup)
  • When the script itself has been modified (update)

By scripting brew bundle using these keywords, you can have uniform brew packages across all environments. Here is the script I am using:

# Only run on MacOS
{{- if eq .chezmoi.os "darwin" -}}
#!/bin/bash

PACKAGES=(
asdf
exa
ranger
chezmoi
difftastic
gnupg
fzf
gh
glab
htop
httpie
neovim
nmap
starship
daipeihust/tap/im-select
)

CASKS=(
alt-tab
shottr
raycast
docker
hammerspoon
hiddenbar
karabiner-elements
obsidian
notion
slack
stats
visual-studio-code
warp
wireshark
google-chrome
)

# Install Homebrew if not already installed
if test ! $(which brew); then
    printf '\n\n\e[33mHomebrew not found. \e[0mInstalling Homebrew...'
    /bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install.sh)"
else
    printf '\n\n\e[0mHomebrew found. Continuing...'
fi

# Update homebrew packages
printf '\nInitiating Homebrew update...\n'
brew update

printf '\nInstalling packages...\n'
brew install ${PACKAGES[@]}

printf '\n\nRemoving out of date packages...\n'
brew cleanup

printf '\n\nInstalling cask apps...\n'
brew install --cask ${CASKS[@]}

{{ end -}}

Even if you are not familiar with sh, it shouldn't be too difficult to understand. Define the PACKAGES list for packages installed with brew install and CASKS for applications installed with brew install --cask. The installation process will be carried out by the script.

Scripting is a relatively complex feature among the functionalities available in chezmoi. There are various ways to apply it, and the same function can be defined differently. For more detailed usage, refer to the official documentation.

Conclusion

In this post, I summarized useful chezmoi settings following the basic usage explained in the previous post. The usage of the script I introduced at the end may seem somewhat complex, contrary to the title of basic settings, but once applied, it can be very convenient to use.

Reference

Managing Dotfiles Conveniently with Chezmoi

· 7 min read

Have you ever felt overwhelmed at the thought of setting up your development environment again after getting a new MacBook? Or perhaps you found a fantastic tool during work, but felt too lazy to set it up again in your personal environment at home? Have you ever hesitated to push your configurations to GitHub due to security concerns?

If you've ever used multiple devices, you might have faced these dilemmas. How can you manage your configurations consistently across different platforms?

Problem

Configuration files such as .zshrc are scattered across different paths, many of them directly under $HOME. Putting the entire home directory under Git just to version these files is daunting, and the sheer scope of what would be tracked can actually make the files harder to manage.

Maintaining a consistent development environment across three devices – a MacBook for work, an iMac at home, and a personal MacBook – seemed practically impossible.

Modifying just one Vim shortcut at work and realizing you have to do the same on the other two devices after work... 😭

With the advent of the Apple Silicon era, the significant disparity between Intel Macs and the new devices made achieving a unified environment even more challenging. I had pondered over this issue for quite some time, as I often forgot to set up aliases frequently used at work on my home machine.

Some of the methods I tried to solve this problem briefly were:

  1. Centralizing dotfiles in a specific folder and managing them as a Git project

    1. The locations of dotfiles vary. In most cases, there are predefined locations even if they are not in the root.
    2. You cannot work directly in a folder with Git set up, and you still have to resort to copy-pasting on other devices.
  2. Symbolic link

    1. To set up on a new device, you have to recreate symbolic links for every file in the correct locations. If you manage many files, this becomes a tiresome task.
    2. The usage is more complex than Git, requiring attention to various details.

In the end, I used the Git approach only for files outside the home directory itself (~/.ssh/config, ~/.config/nvim, etc.) and partially gave up on the files that live directly in $HOME (~/.zshrc, ~/.gitconfig, etc.), until I discovered chezmoi!

Now, let me introduce you to chezmoi, which elegantly solves this challenging problem.

What is Chezmoi?

Manage your dotfiles across multiple diverse machines, securely. - chezmoi.io

Chezmoi is a tool that allows you to manage numerous dotfiles consistently across various environments and devices. As described in the official documentation, with just a few settings, you can ensure 'security'. You don't need to worry about where your dotfiles are or where they should be. You simply need to inform chezmoi of the dotfiles to manage.

Concept

How is this seemingly magical feat possible? 🤔

In essence, chezmoi stores dotfiles in ~/.local/share/chezmoi and when you run chezmoi apply, it checks the status of each dotfile, making minimal changes to ensure they match the desired state. For more detailed concepts, refer to the reference manual.
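
To make this concrete, here is a rough sketch of how files in the source directory map to their targets; dot_ and private_ are chezmoi's source-state naming prefixes, and the exact entries depend on what you add:

# Source state (~/.local/share/chezmoi)   ->  Target on the device
dot_zshrc                                 ->  ~/.zshrc
dot_gitconfig                             ->  ~/.gitconfig
private_dot_ssh/config                    ->  ~/.ssh/config (written without group/other permissions)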

Let's now briefly explain how to use it.

Getting Started with Chezmoi

Once you have installed chezmoi (installation guide here), perform the initialization with the following command:

chezmoi init

This action creates a new Git repository in ~/.local/share/chezmoi (working directory) on your local device to store dotfiles. By default, chezmoi reflects modifications in the working directory on your local device.

If you want to manage your ~/.zshrc file through chezmoi, run the following command:

chezmoi add ~/.zshrc

You will see that the ~/.zshrc file has been copied to ~/.local/share/chezmoi/dot_zshrc.

To edit the ~/.zshrc file managed by chezmoi, use the following command:

chezmoi edit ~/.zshrc

This command opens ~/.local/share/chezmoi/dot_zshrc with $EDITOR for editing. Make some changes for testing and save.

info

If $EDITOR is not in the environment variables, it defaults to using vi.

To check what changes have been made in the working directory, use the following command:

chezmoi diff

If you want to apply the changes made by chezmoi to your local device, use the following command:

chezmoi apply -v

All chezmoi commands can use the -v (verbose) option. This option visually displays what is being applied to your local device, making it clear in the console. By using the -n (dry run) option, you can execute commands without applying them. Therefore, combining the -v and -n options allows you to preview what actions will be taken when running unfamiliar commands.

Now, let's access the source directory directly and push the contents of chezmoi to a remote repository. It is recommended to name the repository dotfiles, as I will explain later.

chezmoi cd
git add .
git commit -m "Initial commit"
git remote add origin https://github.com/$GITHUB_USERNAME/dotfiles.git
git push
tip

By writing the relevant settings in the chezmoi.toml file, you can automate the repository synchronization process for more convenient use.

To exit the chezmoi working directory, use the following command:

exit

Visualizing the process up to this point, it looks like this:

[Figure]

Using Chezmoi on Another Device

This is why we use chezmoi. Let's fetch the contents on the second device using chezmoi. I have used an SSH URL for this example. Assume that chezmoi is already installed on the second device.

chezmoi init git@github.com:$GITHUB_USERNAME/dotfiles.git

By initializing with a specific repository, chezmoi automatically checks for submodules or necessary external source files and generates the chezmoi config file based on the options.

Inspect what changes chezmoi will bring to the second device using the diff command we saw earlier.

chezmoi diff

If you are satisfied with applying all the changes, use the apply command we discussed earlier.

chezmoi apply -v

If you need to modify some files before applying locally, use edit.

chezmoi edit $FILE

Alternatively, you can use a merge tool to apply local changes as if you were using Git merge.

chezmoi merge $FILE
tip

Using chezmoi merge-all will perform a merge operation on all files that require merging.

You can perform all these steps at once with the following command:

chezmoi update -v

Visualizing this process, it looks like this:

[Figure]

You can also apply all the steps needed on the second device at initialization...! This feature can be incredibly useful if the second device is a newly purchased one.

chezmoi init --apply https://github.com/$GITHUB_USERNAME/dotfiles.git

I recommended naming the repository dotfiles earlier because if the repository is named dotfiles, you can use a shortened version of the previous command.

chezmoi init --apply $GITHUB_USERNAME

[Figure]

It's truly convenient...🥹 I believe it will be one of the best open-source tools discovered in 2023.

Conclusion

Chezmoi is impressively well-documented and actively developed. Developed in Golang, it feels quite fast 😄. With some knowledge of shell scripting, you can implement highly automated processes, creating an environment where you hardly need to intervene for settings across multiple devices.

In this article, I covered the basic usage of chezmoi. In the next article, we will delve into managing chezmoi configuration files and maintaining security.

info

If you are curious about my configurations, you can check them here.

Reference

Optimizing Pagination in Spring Batch with Composite Keys

· 7 min read

In this article, I will discuss the issues and solutions encountered when querying a table with millions of data using Spring Batch.

Environment

  • Spring Batch 5.0.1
  • PostgreSQL 11

Problem

While using JdbcPagingItemReader to query a large table, I noticed a significant slowdown in query performance over time and decided to investigate the code in detail.

Default Behavior

The following query is automatically generated and executed by the PagingQueryProvider:

SELECT *
FROM large_table
WHERE id > ?
ORDER BY id
LIMIT 1000;

In Spring Batch, when using JdbcPagingItemReader, instead of using an offset, it generates a where clause for pagination. This allows for fast retrieval of data even from tables with millions of records without any delays.

tip

Even with LIMIT, using OFFSET means reading all previous data again. Therefore, as the amount of data to be read increases, the performance degrades. For more information, refer to the article¹.
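
For reference, a reader configuration that produces this kind of keyset pagination might look like the sketch below; dataSource and the table and column names are placeholders, not values from the original batch job.

// A minimal sketch of a JdbcPagingItemReader with a single sort key (placeholder names).
JdbcPagingItemReader<Map<String, Object>> reader = new JdbcPagingItemReaderBuilder<Map<String, Object>>()
        .name("largeTableReader")
        .dataSource(dataSource)              // placeholder DataSource
        .selectClause("SELECT *")
        .fromClause("FROM large_table")
        .sortKeys(Map.of("id", Order.ASCENDING))
        .pageSize(1000)
        .rowMapper(new ColumnMapRowMapper())
        .build();

With a single-column sort key like this, the provider generates exactly the WHERE id > ? ... LIMIT query shown above.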

Using Multiple Sorting Conditions

The problem arises when querying a table with composite keys. When a composite key consisting of 3 columns is used as the sort key, the generated query looks like this:

SELECT *
FROM large_table
WHERE ((create_at > ?) OR
(create_at = ? AND user_id > ?) OR
(create_at = ? AND user_id = ? AND content_no > ?))
ORDER BY create_at, user_id, content_no
LIMIT 1000;

However, queries with OR operations in the where clause do not utilize indexes effectively. OR operations require executing multiple conditions, making it difficult for the optimizer to make accurate decisions. When I examined the explain output, I found the following results:

Limit  (cost=0.56..1902.12 rows=1000 width=327) (actual time=29065.549..29070.808 rows=1000 loops=1)
  ->  Index Scan using th_large_table_pkey on large_table  (cost=0.56..31990859.76 rows=16823528 width=327) (actual time=29065.547..29070.627 rows=1000 loops=1)
        Filter: ((create_at > '2023-01-28 06:58:13'::timestamp without time zone) OR ((create_at = '2023-01-28 06:58:13'::timestamp without time zone) AND ((user_id)::text > '441997000'::text)) OR ((create_at = '2023-01-28 06:58:13'::timestamp without time zone) AND ((user_id)::text = '441997000'::text) AND ((content_no)::text > '9070711'::text)))
        Rows Removed by Filter: 10000001
Planning Time: 0.152 ms
Execution Time: 29070.915 ms

With a query execution time close to 30 seconds, most of the data is discarded during filtering on the index, resulting in unnecessary time wastage.

Since PostgreSQL manages composite keys as tuples, writing queries using tuples allows for utilizing the advantages of Index scan even in complex where clauses.

SELECT *
FROM large_table
WHERE (create_at, user_id, content_no) > (?, ?, ?)
ORDER BY create_at, user_id, content_no
LIMIT 1000;
Limit  (cost=0.56..1196.69 rows=1000 width=327) (actual time=3.204..11.393 rows=1000 loops=1)
  ->  Index Scan using th_large_table_pkey on large_table  (cost=0.56..20122898.60 rows=16823319 width=327) (actual time=3.202..11.297 rows=1000 loops=1)
        Index Cond: (ROW(create_at, (user_id)::text, (content_no)::text) > ROW('2023-01-28 06:58:13'::timestamp without time zone, '441997000'::text, '9070711'::text))
Planning Time: 0.276 ms
Execution Time: 11.475 ms

It can be observed that data is directly retrieved through the index without discarding any data through filtering.

Therefore, when the query executed by JdbcPagingItemReader uses tuples, it means that even when using composite keys as sort keys, processing can be done very quickly.

Let's dive into the code immediately.

Modifying PagingQueryProvider

Analysis

As mentioned earlier, the responsibility of generating queries lies with the PagingQueryProvider. Since I am using PostgreSQL, the PostgresPagingQueryProvider is selected and used.

[Figure] The generated query differs based on whether it includes a GROUP BY clause.

By examining SqlPagingQueryUtils's buildSortConditions, we can see how the problematic query is generated.

[Figure]

Within the nested for loop, we can see how the query is generated based on the sort key.

Customizing buildSortConditions

Having directly inspected the code responsible for query generation, I decided to modify this code to achieve the desired behavior. However, direct overriding of this code is not possible, so I created a new class called PostgresOptimizingQueryProvider and re-implemented the code within this class.

private String buildSortConditions(StringBuilder sql) {
    Map<String, Order> sortKeys = getSortKeys();
    sql.append("(");
    sortKeys.keySet().forEach(key -> sql.append(key).append(", "));
    sql.delete(sql.length() - 2, sql.length());
    if (is(sortKeys, order -> order == Order.ASCENDING)) {
        sql.append(") > (");
    } else if (is(sortKeys, order -> order == Order.DESCENDING)) {
        sql.append(") < (");
    } else {
        throw new IllegalStateException("Cannot mix ascending and descending sort keys"); // Limitation of tuples
    }
    sortKeys.keySet().forEach(key -> sql.append("?, "));
    sql.delete(sql.length() - 2, sql.length());
    sql.append(")");
    return sql.toString();
}

Test Code

To ensure that the newly implemented section works correctly, I validated it through a test code.

@Test
@DisplayName("The Where clause generated instead of Offset is (create_at, user_id, content_no) > (?, ?, ?).")
void test() {
    // given
    PostgresOptimizingQueryProvider queryProvider = new PostgresOptimizingQueryProvider();
    queryProvider.setSelectClause("*");
    queryProvider.setFromClause("large_table");

    Map<String, Order> parameterMap = new LinkedHashMap<>();
    parameterMap.put("create_at", Order.ASCENDING);
    parameterMap.put("user_id", Order.ASCENDING);
    parameterMap.put("content_no", Order.ASCENDING);
    queryProvider.setSortKeys(parameterMap);

    // when
    String firstQuery = queryProvider.generateFirstPageQuery(10);
    String secondQuery = queryProvider.generateRemainingPagesQuery(10);

    // then
    assertThat(firstQuery).isEqualTo("SELECT * FROM large_table ORDER BY create_at ASC, user_id ASC, content_no ASC LIMIT 10");
    assertThat(secondQuery).isEqualTo("SELECT * FROM large_table WHERE (create_at, user_id, content_no) > (?, ?, ?) ORDER BY create_at ASC, user_id ASC, content_no ASC LIMIT 10");
}

[Figure]

The successful execution confirms that it is working as intended, and I proceeded to run the batch.

[Figure]

Guy: "is it over?"

Boy: "Shut up, It'll happen again!"

-- Within the Webtoon Hive

However, an out-of-range error occurred, indicating that the parameter binding had not been adjusted to match the changed query.

[Figure]

Changing the query does not automatically change how the parameters are injected, so I debugged again to find where they are set.

JdbcOptimizedPagingItemReader

The parameters are created directly by JdbcPagingItemReader: stepping through its getParameterList method showed that it keeps adding the sort key values so that there is one set for every OR group injected into the SQL.

[Figure]

I thought I could just override this method, but unfortunately it is not possible because it is private. After much thought, I copied the entire JdbcPagingItemReader and modified only the getParameterList part.

The getParameterList method is overridden in JdbcOptimizedPagingItemReader as follows:

private List<Object> getParameterList(Map<String, Object> values, Map<String, Object> sortKeyValue) {
    // ...
    // Returns the parameters that need to be set in the where clause without increasing them.
    return new ArrayList<>(sortKeyValue.values());
}

Since the sort key values no longer need to be duplicated for each OR group, they are simply returned as the parameter list.
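
For completeness, wiring the two custom classes together might look roughly like this; it is a hypothetical sketch that assumes the copied reader keeps JdbcPagingItemReader's setters, and the row mapper, data source, and row type are placeholders.

// Hypothetical wiring of the custom provider and the copied reader (placeholder names).
PostgresOptimizingQueryProvider queryProvider = new PostgresOptimizingQueryProvider();
queryProvider.setSelectClause("*");
queryProvider.setFromClause("large_table");
queryProvider.setSortKeys(sortKeys); // LinkedHashMap of create_at, user_id, content_no

JdbcOptimizedPagingItemReader<LargeTableRow> reader = new JdbcOptimizedPagingItemReader<>();
reader.setDataSource(dataSource);
reader.setQueryProvider(queryProvider);
reader.setPageSize(1000);
reader.setRowMapper(new BeanPropertyRowMapper<>(LargeTableRow.class));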

Now, let's run the batch again.

The first query is executed without requiring parameters,

2023-03-13T17:43:14.240+09:00 DEBUG 70125 --- [           main] o.s.jdbc.core.JdbcTemplate               : Executing SQL query [SELECT * FROM large_table ORDER BY create_at ASC, user_id ASC, content_no ASC LIMIT 2000]

The subsequent query execution receives parameters from the previous query.

2023-03-13T17:43:14.253+09:00 DEBUG 70125 --- [           main] o.s.jdbc.core.JdbcTemplate               : Executing prepared SQL statement [SELECT * FROM large_table WHERE (create_at, user_id, content_no) > (?, ?, ?) ORDER BY create_at ASC, user_id ASC, content_no ASC LIMIT 2000]

The queries are executed exactly as intended! 🎉

For pagination processing with over 10 million records, queries that used to take around 30 seconds now run in the range of 0.1 seconds, representing a significant performance improvement of nearly 300 times.

[Figure]

Now, regardless of the amount of data, queries can be read within milliseconds without worrying about performance degradation. 😎

Conclusion

In this article, I introduced the method used to optimize Spring Batch in an environment with composite keys. However, there is a drawback to this method: all columns that make up the composite key must have the same sorting condition. If desc or asc are mixed within the index condition generated by the composite key, a separate index must be used to resolve this issue 😢

Let's summarize today's content in one line and conclude the article.

"Avoid using composite keys as much as possible and use surrogate keys unrelated to the business."

Reference


Footnotes

  1. https://jojoldu.tistory.com/528

Improving Code Productivity with Design Patterns in O2

· 7 min read

This article discusses the process of using design patterns to improve the structure of the O2 project for more flexible management.

Problem

While diligently working on development, a sudden Issue was raised one day.

[Figure]

Reflecting the contents of the Issue was not difficult. However, as I delved into the code, an issue that had been put off for a while started to surface.

[Figure]

Below is the implementation of the Markdown syntax conversion code that had been previously written.

warning

Due to the length of the code, a partial excerpt is provided. For the full code, please refer to O2 plugin v1.1.1 🙏

export async function convertToChirpy(plugin: O2Plugin) {
  try {
    await backupOriginalNotes(plugin);
    const markdownFiles = await renameMarkdownFile(plugin);
    for (const file of markdownFiles) {
      // remove double square brackets
      const title = file.name.replace('.md', '').replace(/\s/g, '-');
      const contents = removeSquareBrackets(await plugin.app.vault.read(file));
      // change resource link to jekyll link
      const resourceConvertedContents = convertResourceLink(plugin, title, contents);

      // callout
      const result = convertCalloutSyntaxToChirpy(resourceConvertedContents);

      await plugin.app.vault.modify(file, result);
    }

    await moveFilesToChirpy(plugin);
    new Notice('Chirpy conversion complete.');
  } catch (e) {
    console.error(e);
    new Notice('Chirpy conversion failed.');
  }
}

Being unfamiliar with TypeScript and Obsidian usage, I had focused more on implementing features rather than the overall design. Now, trying to add new features, it became difficult to anticipate side effects, and the code implementation lacked clear communication of the developer's intent.

To better understand the code flow, I created a graph of the current process as follows.

Although I had separated functionalities into functions, the code was still procedurally written, where the order of code lines significantly impacted the overall operation. Adding a new feature in this state would require precise implementation to avoid breaking the entire conversion process. So, where would be the correct place to implement a new feature? Most likely, the answer would be 'I need to see the code.' Currently, with most of the code written in one large file, it was almost equivalent to needing to analyze the entire code. In object-oriented terms, one could say that the Single Responsibility Principle (SRP) was not being properly followed.

This state, no matter how positively described, did not seem easy to maintain. Since the O2 plugin was created for my personal use, I could not justify producing spaghetti code that was difficult to maintain by rationalizing, 'It's because I'm not familiar with TS.'

Before resolving the issue, I decided to set the feature request aside for a moment and improve the structure first.

What Structure Should Be Implemented?

The O2 plugin, as a syntax conversion plugin, must be capable of converting Obsidian's Markdown syntax into various formats, which is a clear requirement.

Therefore, the design should focus primarily on scalability.

Each platform logic should be modularized, and the conversion process abstracted to implement it like a template. This way, when implementing new features for supporting different platform syntaxes, developers can focus only on the small unit of implementing syntax conversion without needing to reimplement the entire flow.

Based on this, the design requirements are as follows:

  1. Strings (content of Markdown files) should be converted in order (or not) as needed.
  2. Specific conversion logic should be skippable or dynamically controllable based on external settings.
  3. Implementing new features should be simple and should have minimal or no impact on existing code.

Since the conversions run in a sequence and new steps need to slot in between existing ones, the Chain of Responsibility pattern seemed appropriate for this purpose.

Applying Design Patterns

Process->Process->Process->Done! : Summary of Chain of Responsibility

export interface Converter {
  setNext(next: Converter): Converter;
  convert(input: string): string;
}

export abstract class AbstractConverter implements Converter {
  private next: Converter;

  setNext(next: Converter): Converter {
    this.next = next;
    return next;
  }

  convert(input: string): string {
    if (this.next) {
      return this.next.convert(input);
    }
    return input;
  }
}

The Converter interface plays a role in converting specific strings through convert(input). By specifying the next Converter to be processed with setNext, and returning the Converter again, method chaining can be used.

With the abstraction in place, the conversion logic that was previously implemented in one file was separated into individual Converter implementations, assigning responsibility for each feature. Below is an example of the CalloutConverter that separates the Callout syntax conversion logic.

export class CalloutConverter extends AbstractConverter {
  convert(input: string): string {
    const result = convertCalloutSyntaxToChirpy(input);
    return super.convert(result);
  }
}

function convertCalloutSyntaxToChirpy(content: string) {
  function replacer(match: string, p1: string, p2: string) {
    return `${p2}\n{: .prompt-${replaceKeyword(p1)}}`;
  }

  return content.replace(ObsidianRegex.CALLOUT, replacer);
}

Now, the relationships between the classes are as follows.

Now, by combining the smallest units of functionality implemented in each Converter, a chain is created to perform operations in sequence. This is why this pattern is called Chain of Responsibility.

export async function convertToChirpy(plugin: O2Plugin) {
  // ...
  // Create conversion chain
  frontMatterConverter.setNext(bracketConverter)
    .setNext(resourceLinkConverter)
    .setNext(calloutConverter);

  // Request the frontMatterConverter at the head to perform the conversion, and the connected converters will operate sequentially.
  const result = frontMatterConverter.convert(await plugin.app.vault.read(file));
  await plugin.app.vault.modify(file, result);
  // ...
}

Now that the logic has been separated into appropriate responsibilities, reading the code has become much easier. When needing to add a new feature, only the necessary Converter needs to be implemented. Additionally, without needing to know how other Converters work, new features can be added through setNext. Each Converter operates independently, following the principle of encapsulation.
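
For example, supporting a new piece of syntax would only require one more Converter implementation and one more link in the chain. The FootnoteConverter below is a hypothetical illustration, not code from the actual plugin:

// Hypothetical example: a converter for footnote syntax.
export class FootnoteConverter extends AbstractConverter {
  convert(input: string): string {
    const result = convertFootnoteSyntaxToChirpy(input); // assumed conversion function, not implemented here
    return super.convert(result);
  }
}

// Hooked onto the end of the existing chain without touching the other converters.
frontMatterConverter.setNext(bracketConverter)
  .setNext(resourceLinkConverter)
  .setNext(calloutConverter)
  .setNext(new FootnoteConverter());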

Finally, I checked if all tests passed and created a PR.

[Figure]

Next Step

Although the structure has improved significantly, there is one remaining drawback. In the structure linked through setNext, calling the Converter at the very front is necessary for proper operation. If a different Converter is called instead of the one at the front, the result may be different from the intended one. If, for example, a NewConverter is implemented before frontMatterConverter but frontMatterConverter.convert(input) is not modified, NewConverter will not be applied.

This is one of the aspects that developers need to pay attention to, as there is room for error, and it is one of the areas that needs improvement in the future. For instance, implementing a kind of Context to contain the Converters and executing the conversion process without directly calling the Converters could be a way to improve. This is something I plan to implement in the next version.
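
As a rough sketch of that direction (not the released code), a Context could own the converter list and run the conversion itself, so callers never need to know which converter comes first:

// Sketch of a possible ConverterContext; assumes the registered converters are not linked via setNext.
export class ConverterContext {
  private readonly converters: Converter[] = [];

  register(converter: Converter): ConverterContext {
    this.converters.push(converter);
    return this;
  }

  convert(input: string): string {
    // The context owns the order, so callers no longer depend on a specific "head" converter.
    return this.converters.reduce((content, converter) => converter.convert(content), input);
  }
}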


2023-03-12 Update

Thanks to a PR, the same functionality is now achieved with a more flexible structure that uses composition instead of inheritance.

Conclusion

In this article, I described the process of redistributing roles and responsibilities through design patterns from a procedurally written, monolithic file to a more object-oriented and maintainable code.

info

The complete code can be found on GitHub.

Caution when setting sortKeys in JdbcItemReader

· 4 min read

I would like to share an issue I encountered while retrieving large amounts of data in PostgreSQL.

Problem

While using the Spring Batch JdbcPagingItemReader, I set the sortKeys as follows:

...
.selectClause("SELECT *")
.fromClause("FROM big_partitioned_table_" + yearMonth)
.sortKeys(Map.of(
        "timestamp", Order.ASCENDING,
        "mmsi", Order.ASCENDING,
        "imo_no", Order.ASCENDING
))
...

Although the current table's index is set as a composite index with timestamp, mmsi, and imo_no, I expected an Index scan to occur during the retrieval. However, in reality, a Seq scan occurred. The target table contains around 200 million records, causing the batch process to show no signs of completion. Eventually, I had to forcibly shut down the batch. Why did a Seq scan occur even when querying with index conditions? 🤔

In PostgreSQL, Seq scans occur in the following cases:

  • When the optimizer determines that a Seq scan is faster due to the table having a small amount of data
  • When the data being queried is too large (more than 10% of the table), and the optimizer deems Index scan less efficient than Seq scan
    • In such cases, you can use limit to adjust the amount of data and execute an Index scan

In this case, since select * was used, there was a possibility of a Seq scan due to the large amount of data being queried. However, due to the chunk size, the query was performed with limit, so I thought that Index scan would occur continuously.

Debugging

To identify the exact cause, let's check the actual query being executed. By slightly modifying the YAML configuration, we can observe the queries executed by the JdbcPagingItemReader.

logging:
  level:
    org.springframework.jdbc.core.JdbcTemplate: DEBUG

I reran the batch process to directly observe the queries.

SELECT * FROM big_partitioned_table_202301 ORDER BY imo_no ASC, mmsi ASC, timestamp ASC LIMIT 1000

The order of the order by clause seemed odd, so I ran it again.

SELECT * FROM big_partitioned_table_202301 ORDER BY timestamp ASC, mmsi ASC, imo_no ASC LIMIT 1000

It was evident that the order by condition was changing with each execution.

To ensure the correct order for an Index scan, the sorting conditions must be specified in the correct order. Passing a general Map to sortKeys does not guarantee the order, leading to the SQL query not executing as intended.

To maintain order, you can use a LinkedHashMap to create the sortKeys.

Map<String, Order> sortKeys = new LinkedHashMap<>();
sortKeys.put("timestamp", Order.ASCENDING);
sortKeys.put("mmsi", Order.ASCENDING);
sortKeys.put("imo_no", Order.ASCENDING);

After making this adjustment and rerunning the batch, we could confirm that the sorting conditions were specified in the correct order.

SELECT * FROM big_partitioned_table_202301 ORDER BY timestamp ASC, mmsi ASC, imo_no ASC LIMIT 1000

Conclusion

The issue of Seq scan occurring instead of an Index scan was not something that could be verified with the application's test code, so we were unaware of any potential bugs. It was only when we observed a significant slowdown in the batch process in the production environment that we realized something was amiss. During development, I had not anticipated that the order of sorting conditions could change due to the Map data structure.

Fortunately, if an Index scan does not occur due to the large amount of data being queried, the batch process would slow down significantly with the LIMIT query, making it easy to notice the issue. However, if the data volume was low and the execution speeds of Index scan and Seq scan were similar, it might have taken a while to notice the problem.

Further consideration is needed on how to anticipate and address this issue in advance. Since the order of the order by condition is often crucial, it is advisable to use LinkedHashMap over HashMap whenever possible.

[O2] Developing an Obsidian Plugin

· 7 min read

Overview

Obsidian provides a graph view through links between Markdown files, making it convenient to store and navigate information. However, to achieve this, Obsidian enforces its own unique syntax in addition to the original Markdown syntax. This can lead to areas of incompatibility when reading Markdown documents from Obsidian on other platforms.

Currently, I use a Jekyll blog for posting, so when I write in Obsidian, I have to manually adjust the syntax later for blog publishing. Specifically, the workflow involves:

  • Using [[]] for file links, which is Obsidian's unique syntax
  • Resetting attachment paths, including image files
  • Renaming title.md to yyyy-MM-dd-title.md
  • Callout syntax

[Figure] Double-dashed arrows crossing layer boundaries require manual intervention.

As I use both Obsidian and Jekyll concurrently, there was a need to automate this syntax conversion process and attachment copying process.

Since Obsidian allows for functionality extension through community plugins unlike Notion, I decided to try creating one myself. After reviewing the official documentation, I found that Obsidian guides plugin development based on NodeJS. While the language options were limited, I had an interest in TypeScript, so I set up a NodeJS/TS environment to study.

Implementation Process

Naming

I first tackled the most important part of development.

It didn't take as long as I thought, as I came up with the project name 'O2' based on a sudden idea while writing a description, 'convert Obsidian syntax to Jekyll,' for the plugin.

[Figure]

Preparation for Conversion

With a suitable name in place, the next step was to determine how to convert which files.

The workflow for blog posting is as follows:

  1. Write drafts in a folder named ready.
  2. Once the manuscript is complete, copy the files, including attachments, to the Jekyll project, appropriately converting Obsidian syntax to Jekyll syntax in the process.
  3. Move the manuscript from the ready folder to published to indicate that it has been published.

I decided to program this workflow as is. However, rather than converting the files by hand in the Jekyll project opened in VS Code, the plugin creates copies inside its own workspace and converts those to Jekyll syntax, so the original notes are never modified.

To summarize this step briefly:

  1. Copy the manuscript A.md from /ready to /published, keeping /published/A.md as an unmodified backup.
  2. Convert the title and syntax of /ready/A.md.
  3. Move /ready/yyyy-MM-dd-A.md to the path for Jekyll publishing.

Let's start the implementation.

Copying Original Files

// Get only Markdown files in the ready folder
function getFilesInReady(plugin: O2Plugin): TFile[] {
  return this.app.vault.getMarkdownFiles()
    .filter((file: TFile) => file.path.startsWith(plugin.settings.readyDir))
}

// Copy files to the published folder
async function copyToPublishedDirectory(plugin: O2Plugin) {
  const readyFiles = getFilesInReady.call(this, plugin)
  readyFiles.forEach((file: TFile) => {
    return this.app.vault.copy(file, file.path.replace(plugin.settings.readyDir, plugin.settings.publishedDir))
  })
}

By fetching Markdown files inside the /ready folder and replacing file.path with publishedDir, copying can be done easily.

Copying Attachments and Resetting Paths

function convertResourceLink(plugin: O2Plugin, title: string, contents: string) {
  const absolutePath = this.app.vault.adapter.getBasePath()
  const resourcePath = `${plugin.settings.jekyllResourcePath}/${title}`
  fs.mkdirSync(resourcePath, {recursive: true})

  const relativeResourcePath = plugin.settings.jekyllRelativeResourcePath

  // Copy resourceDir/image.png to assets/img/<title>/image.png before changing
  extractImageName(contents)?.forEach((resourceName) => {
    fs.copyFile(
      `${absolutePath}/${plugin.settings.resourceDir}/${resourceName}`,
      `${resourcePath}/${resourceName}`,
      (err) => {
        if (err) {
          new Notice(err.message)
        }
      }
    )
  })
  // Syntax conversion
  return contents.replace(ObsidianRegex.IMAGE_LINK, `![image](/${relativeResourcePath}/${title}/$1)`)
}

Attachments require moving files outside the vault, which cannot be achieved using Obsidian's default APIs. Therefore, direct file system access using fs is necessary.

info

Direct file system access implies difficulty in mobile usage, so the Obsidian official documentation guides specifying isDesktopOnly as true in manifest.json in such cases.

Before moving Markdown files to the Jekyll project, the Obsidian image link syntax is parsed to identify image filenames, which are then moved to Jekyll's resource folder so that the Markdown default image links are converted correctly, allowing attachments to be found.
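
The extraction itself can be done with the image link regular expression shown later in this post. Below is a minimal sketch of such a helper; the actual extractImageName in the plugin may differ.

// Minimal sketch: pull image file names out of Obsidian links like ![[image.png]].
// Uses ObsidianRegex.IMAGE_LINK (/!\[\[(.*?)]]/g); the real helper in O2 may differ.
function extractImageName(content: string): string[] | undefined {
  const names = Array.from(content.matchAll(ObsidianRegex.IMAGE_LINK), (match) => match[1])
  return names.length > 0 ? names : undefined
}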

Callout Syntax Conversion

Obsidian callout

> [!NOTE] callout title
> callout contents

Supported keywords: tip, info, note, warning, danger, error, etc.

Jekyll chirpy callout

> callout contents
{: .prompt-info}

Supported keywords: tip, info, warning, danger

As the syntax of the two differs, regular expressions are used to substitute this part, requiring implementation of a replacer.

export function convertCalloutSyntaxToChirpy(content: string) {
  function replacer(match: string, p1: string, p2: string) {
    if (p1.toLowerCase() === 'note') {
      p1 = 'info'
    }
    if (p1.toLowerCase() === 'error') {
      p1 = 'danger'
    }
    return `${p2}\n{: .prompt-${p1.toLowerCase()}}`
  }

  return content.replace(ObsidianRegex.CALLOUT, replacer)
}

Unsupported keywords in Jekyll are converted to other keywords with similar roles.

Moving Completed Files

The Jekyll-based blog I currently use has a specific path where posts need to be located for publishing. Since the Jekyll project location may vary per client, custom path handling is required. I decided to set this up through a settings tab and created an input form like the one below.

[Figure]

Once all conversions are done, moving the files to the _posts path of the Jekyll project completes the publication process.

async function moveFilesToChirpy(plugin: O2Plugin) {
  // Absolute path is needed to move files outside the vault
  const absolutePath = this.app.vault.adapter.getBasePath()
  const sourceFolderPath = `${absolutePath}/${plugin.settings.readyDir}`
  const targetFolderPath = plugin.settings.targetPath()

  fs.readdir(sourceFolderPath, (err, files) => {
    if (err) throw err

    files.forEach((filename) => {
      const sourceFilePath = path.join(sourceFolderPath, filename)
      const targetFilePath = path.join(targetFolderPath, filename)

      fs.rename(sourceFilePath, targetFilePath, (err) => {
        if (err) {
          console.error(err)
          new Notice(err.message)
          throw err
        }
      })
    })
  })
}

Regular Expressions

export namespace ObsidianRegex {
  export const IMAGE_LINK = /!\[\[(.*?)]]/g
  export const DOCUMENT_LINK = /(?<!!)\[\[(.*?)]]/g
  export const CALLOUT = /> \[!(NOTE|WARNING|ERROR|TIP|INFO|DANGER)].*?\n(>.*)/ig
}

Special syntax unique to Obsidian was handled using regular expressions for parsing. By using groups, specific parts could be extracted for conversion, making the process more convenient.

Creating a PR for Community Plugin Release

Finally, to register the plugin in the community plugin repository, I conclude by creating a PR. It is essential to adhere to community guidelines; otherwise, the PR may be rejected. Obsidian provides guidance on what to be mindful of when developing plugins, so it's crucial to follow these guidelines as closely as possible.

[Figure]

Based on previous PRs, it seems that merging takes approximately 2-4 weeks. If feedback is received later, I will make the necessary adjustments and patiently wait for the merge.

Conclusion

I thought, 'This should be a quick job, maybe done in 3 days,' but trying to implement the plugin while traveling abroad ended up taking about a week, including creating the release PR 😂

[Figure] I wonder if Kent Beck and Erich Gamma, who developed JUnit, coded like this on a plane...

Switching to TypeScript from Java or Kotlin made things challenging, as I wasn't familiar with it, and I wasn't confident if the code I was writing was best practice. However, thanks to this, I delved into JS syntax like async-await in detail, adding another technology stack to my repertoire. It's a proud feeling. This also gave me a new topic to write about.

The best part is that there's almost no need for manual work in blog posting anymore! After converting the syntax with the plugin, I only need to do a spell check before pushing to GitHub. Of course, there are still many bugs...

Moving forward, I plan to continue studying TypeScript gradually to eliminate anti-patterns in the plugin and improve the design for cleaner modules.

If you're facing similar dilemmas, contributing to the project or collaborating in other ways to build it together would be great! You're welcome anytime 😄

info

You can check out the complete code on GitHub.

Next Steps 🤔

  • Fix minor bugs
  • Support footnote syntax
  • Support image resize syntax
  • Implement transaction handling for rollback in case of errors during conversion
  • Abstract processing for adding other modules

Release 🚀

After about 6 days of code review, the PR was merged. The plugin is now available for use in the Obsidian Community plugin repository. 🎉

[Figure]

Reference

Managing Google Kubernetes Engine through Local CLI

· 3 min read

Overview

While it is very convenient to be able to run kubectl through Google's Cloud Shell via the web from anywhere, there is a drawback of needing to go through the hassle of web access and authentication for simple query commands. This article shares a method for quickly managing Google Cloud Kubernetes using a local CLI.

Contents

Installing GCP CLI

First, you need to install the GCP CLI. Refer to the gcp-cli link to check for the appropriate operating system and install it.

Connection

Once the installation is complete, proceed with the authentication process using the following command:

gcloud init

You need to access the GCP Kubernetes Engine and fetch the connection information for the cluster.

[Figure] GKE console: cluster connection screens.

Copy the command for command-line access and execute it in the terminal.

gcloud container clusters get-credentials sv-dev-cluster --zone asia-northeast3-a --project {projectId}
Fetching cluster endpoint and auth data.
CRITICAL: ACTION REQUIRED: gke-gcloud-auth-plugin, which is needed for continued use of kubectl, was not found or is not executable. Install gke-gcloud-auth-plugin for use with kubectl by following https://cloud.google.com/blog/products/containers-kubernetes/kubectl-auth-changes-in-gke
kubeconfig entry generated for sv-dev-cluster.

Plugin Installation

If the current Kubernetes version being used is below v1.26, you may encounter an error requesting the installation of gke-gcloud-auth-plugin. Install the plugin using the following command.

info

Prior to v1.26, client-specific code for managing authentication between the client and Google Kubernetes Engine was included in the existing versions of kubectl and custom Kubernetes clients. Starting from v1.26, this code is no longer included in the OSS kubectl. GKE users need to download and use a separate authentication plugin to generate GKE-specific tokens. The new binary, gke-gcloud-auth-plugin, extends the kubectl authentication for GKE using the Kubernetes Client-go user authentication information plugin mechanism. Since the plugin is already supported in kubectl, you can switch to this new mechanism before v1.26 is provided. - Google

gcloud components install gke-gcloud-auth-plugin
Your current Google Cloud CLI version is: 408.0.1
Installing components from version: 408.0.1

┌────────────────────────────────────────────┐
│ These components will be installed.        │
├────────────────────────┬─────────┬─────────┤
│          Name          │ Version │   Size  │
├────────────────────────┼─────────┼─────────┤
│ gke-gcloud-auth-plugin │  0.4.0  │ 7.1 MiB │
└────────────────────────┴─────────┴─────────┘

For the latest full release notes, please visit:
https://cloud.google.com/sdk/release_notes

Do you want to continue (Y/n)? y

╔════════════════════════════════════════════════════════════╗
╠═ Creating update staging area ═╣
╠════════════════════════════════════════════════════════════╣
╠═ Installing: gke-gcloud-auth-plugin ═╣
╠════════════════════════════════════════════════════════════╣
╠═ Installing: gke-gcloud-auth-plugin ═╣
╠════════════════════════════════════════════════════════════╣
╠═ Creating backup and activating new installation ═╣
╚════════════════════════════════════════════════════════════╝

Performing post processing steps...done.

Update done!

Re-run the connection command, and you should see the cluster connect without any error messages.

gcloud container clusters get-credentials sv-dev-cluster --zone asia-northeast3-a --project {projectId}
Fetching cluster endpoint and auth data.
kubeconfig entry generated for sv-dev-cluster.

Once the connection is successful, you will also notice changes in Docker Desktop. Specifically, new information will be displayed in the Kubernetes tab.

[Figure]

Afterwards, you can also directly check GKE resources locally using kubectl.

kubectl get deployments
NAME             READY   UP-TO-DATE   AVAILABLE   AGE
my-application   1/1     1            1           20d

Conclusion

We have briefly explored efficient ways to manage GKE resources locally. Using kubectl locally enables extended features like autocomplete, making Kubernetes management much more convenient. If you are new to using GKE, I strongly recommend giving it a try.
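
For example, autocompletion takes just one line in your shell profile (zsh shown here; kubectl provides the same for bash):

# Enable kubectl autocompletion in zsh (add to ~/.zshrc)
source <(kubectl completion zsh)

# bash equivalent
source <(kubectl completion bash)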

Reference

k8s-plugin

Speeding up Test Execution, Spring Context Mocking

· 3 min read

Overview

Writing test code in every project has become a common practice. As projects grow, the number of tests inevitably increases, leading to longer overall test execution times. Particularly in projects based on the Spring framework, test execution can significantly slow down due to the loading of Spring Bean contexts. This article introduces methods to address this issue.

Write All Tests as Unit Tests

Tests need to be fast. The faster they are, the more frequently they can be run without hesitation. If running all tests once takes 10 minutes, it means feedback will only come after 10 minutes.

To achieve faster tests in Spring, it is essential to avoid using @SpringBootTest. Loading all Beans causes the time to load necessary Beans to be overwhelmingly longer than executing the code for testing business logic.

@SpringBootTest
class SpringApplicationTest {

    @Test
    void main() {
    }
}

The above code is a basic test code for running a Spring application. All Beans configured by @SpringBootTest are loaded. How can we inject only the necessary Beans for testing?

Utilizing Annotations or Mockito

By using specific annotations, only the necessary Beans for related tests are automatically loaded. This way, instead of loading all Beans through Context loading, only the truly needed Beans are loaded, minimizing test execution time.

Let's briefly look at a few annotations.

  • @WebMvcTest: Loads only Web MVC related Beans.
  • @WebFluxTest: Loads only Web Flux related Beans. Allows the use of WebTestClient.
  • @DataJpaTest: Loads only JPA repository related Beans.
  • @WithMockUser: When using Spring Security, creates a fake User, skipping unnecessary authentication processes.

Additionally, by using Mockito, complex dependencies can be easily resolved to write tests. By appropriately utilizing these two concepts, most unit tests are not overly difficult.
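
As a quick illustration of the slice-test idea, a controller test might look like the hedged sketch below; ProductController and ProductService are placeholder names, only the web layer is loaded, and the collaborator is mocked.

// Sketch of a web-slice test: only MVC-related beans are loaded, the collaborator is a Mockito mock.
@WebMvcTest(ProductController.class)
class ProductControllerTest {

    @Autowired
    MockMvc mockMvc;

    @MockBean
    ProductService productService; // replaced with a mock instead of loading the real bean

    @Test
    void returnsOk() throws Exception {
        mockMvc.perform(get("/products/1"))
                .andExpect(status().isOk());
    }
}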

warning

If excessive mocking is required, there is a high possibility that the dependency design is flawed. Be cautious not to overuse mocking.

What about SpringApplication?

For SpringApplication to run, SpringApplication.run() must be executed. Instead of inefficiently loading all Spring contexts to verify the execution of this method, we can mock the SpringApplication where context loading occurs and verify only if run() is called without using @SpringBootTest.

class DemoApplicationTests {

    @Test
    void main() {
        try (MockedStatic<SpringApplication> springApplication = mockStatic(SpringApplication.class)) {
            when(SpringApplication.run(DemoApplication.class)).thenReturn(null);

            DemoApplication.main(new String[]{});

            springApplication.verify(
                    () -> SpringApplication.run(DemoApplication.class), only()
            );
        }
    }
}

Conclusion

In Robert C. Martin's Clean Code, Chapter 9 discusses the 'FIRST principle'.

Its first letter, F, stands for Fast. This article briefly covered what it takes to keep tests fast, and to emphasize the importance of fast tests once more, I'll conclude with the quote:

Tests must be fast enough. - Robert C. Martin

Reference

Fixture Monkey 0.4.x

· 3 min read
warning

As of May 2024, this post is no longer valid. Instead, please refer to Making Testing Easy and Convenient with Fixture Monkey.

Overview

With the update to FixtureMonkey version 0.4.x, there have been significant changes in functionality. It has only been a month since the previous post¹, and the large number of changes (ㅠ) was a bit overwhelming, but taking comfort in the active community, I am writing a new post that reflects the updated features.

Changes

LabMonkey

An experimental feature has been added as an instance.

LabMonkey inherits from FixtureMonkey and supports existing features while adding several new methods. The overall usage is similar, so it seems that using LabMonkey instead of FixtureMonkey would be appropriate. It is said that after version 1.0.0, the functionality of LabMonkey will be deprecated, and the same features will be provided by FixtureMonkey.

private final LabMonkey fixture = LabMonkey.create();

Change in Object Creation Method

The responsibility has shifted from ArbitraryGenerator to ArbitraryIntrospector.

Record Support

Now, you can also create Record through FixtureMonkey.

public record LottoRecord(int number) {}

class LottoRecordTest {

    private final LabMonkey fixture = LabMonkey.labMonkeyBuilder()
            .objectIntrospector(ConstructorPropertiesArbitraryIntrospector.INSTANCE)
            .build();

    @Test
    void shouldBetween1to45() {
        LottoRecord lottoRecord = fixture.giveMeOne(LottoRecord.class);

        System.out.println("lottoRecord: " + lottoRecord);

        assertThat(lottoRecord).isNotNull();
    }
}
lottoRecord: LottoRecord[number=-6]

By using ConstructorPropertiesArbitraryIntrospector to create objects, you can create Record objects. In version 0.3.x, the ConstructorProperties annotation was required, but now you don't need to make changes to the production code, which is quite a significant change.

In addition, various Introspectors exist to support object creation in a way that matches the shape of the object.

Plugin

private final LabMonkey fixture = LabMonkey.labMonkeyBuilder()
        .objectIntrospector(ConstructorPropertiesArbitraryIntrospector.INSTANCE)
        .plugin(new JavaxValidationPlugin())
        .build();

Through the fluent API plugin(), you can easily add plugins. By adding JavaxValidationPlugin, you can apply Java Bean Validation functionality to create objects.

It seems like a kind of decorator pattern, making it easier to develop and apply various third-party plugins.

public record LottoRecord(
        @Min(1)
        @Max(45)
        int number
) {
    public LottoRecord {
        if (number < 1 || number > 45) {
            throw new IllegalArgumentException("The lotto number must be between 1 and 45.");
        }
    }
}

@RepeatedTest(100)
void shouldBetween1to45() {
    LottoRecord lottoRecord = fixture.giveMeOne(LottoRecord.class);

    assertThat(lottoRecord.number()).isBetween(1, 45);
}

Conclusion

Most of the areas that were mentioned as lacking in the previous post have been improved, and I am very satisfied with it. Somehow, though, the documentation² seems a bit thinner than before...

Reference

Footnotes

  1. FixtureMonkey 0.3.0 - Object Creation Strategy

  2. FixtureMonkey