-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Lenta grabber #9
base: master
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Основное: нет отправки XMPP серверу
flamenews/lenta-grabber/pom.xml
Outdated
<dependency> | ||
<groupId>org.apache.httpcomponents</groupId> | ||
<artifactId>httpclient</artifactId> | ||
<version>4.2.3</version> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Версии мы задаем в parent.pom
|
||
public class LentaGrabber { | ||
|
||
private static void addNewFile (String dirPath, News news) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Это функционал, который не относится напрямую к LentaGrabber. Давайте сделаем для него интерфейс и вынесем отдельно.
System.out.println("Directory created"); | ||
} | ||
catch(SecurityException e){ | ||
System.out.println(e.getMessage()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Давайте логировать не sout-ами, а Logger-ом
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- в этом случае дальнейшее исполнение вряд ли имеет смысл, поэтому лучше бросить RuntimeException
catch(SecurityException e){
throw new RuntimeException(e);
}
try { | ||
num = Integer.parseInt(name.substring(0, name.lastIndexOf("."))); | ||
} catch (NumberFormatException e) { | ||
continue; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Если мы здесь обломались, наверное что-то пошло не так и продолжать не стоит
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Я имел в виду, что если что-то (зачем-то) положил в директорию ../news/
что-то странное с дурацким именем, то он, конечно, человек нехороший, но ну и пусть. Но если не пусть, то подправлю.
writer.flush(); | ||
} | ||
catch (IOException e) { | ||
System.out.println(e.getMessage()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
То же, что и выше
public class LentaGrabber { | ||
|
||
private static void addNewFile (String dirPath, News news) { | ||
int curNameOfFile = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Между объявлением переменной и ее использованием как-то слишком много левого кода :)
public static void main(String[] args) throws IOException, JAXBException, InterruptedException { | ||
Item oldItem = null; | ||
for(;;) { | ||
List<News> news = new ArrayList<>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Зачем нужен этот список?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
может быть несколько новостей за последнюю минуту
Unmarshaller jaxbUnmarshaller = jaxbContext.createUnmarshaller(); | ||
RSS newrss = null; | ||
if (oldItem == null) { | ||
RSS rss = (RSS) jaxbUnmarshaller.unmarshal(xml); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Если нет oldItem мы просто забиваем и ничего не делаем?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
подразумевается что это происходит один раз на первой итерации
List<Item> newItems = newrss.getChannel().getItems(); | ||
int num = 0; | ||
for (int i = 0; i < newItems.size(); i++) { | ||
if (newItems.get(i).getPubDate().equals(oldItem.getPubDate())) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
А pubDate точно уникальная?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
неизвестно
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
добавил еще проверку заголовка
System.out.println(newrss.getChannel().getItems().get(i).getTitle()); | ||
} | ||
} | ||
for (Item i : newrss.getChannel().getItems().subList(0, num)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Я не очень понимаю, зачем здесь нужен subList
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
в newrss 200 последних новостей, это чтобы выбрать только не обработанные ранее
final String URL_LENTA = args[0]; // "https://lenta.ru/rss/news"; | ||
final String directory = args[1]; // "../news/"; | ||
|
||
List<News> news = new ArrayList<>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Почему мы не можем каждую новость обрабатывать без записи в список? Мы же просто записываем в него, а потом сразу читаем и чистим. Какой в этом смысл?
final String directory = args[1]; // "../news/"; | ||
|
||
List<News> news = new ArrayList<>(); | ||
StreamSource xml = new StreamSource(new StringReader(sendRequest(URL_LENTA))); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Некоторые переменные все еще не final
RSS newrss = null; | ||
if (oldItem == null) { | ||
RSS rss = (RSS) jaxbUnmarshaller.unmarshal(xml); | ||
oldItem = rss.getChannel().getItems().get(0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Это не круто, я не хочу ждать минуту при тестировании
Unmarshaller jaxbUnmarshaller = jaxbContext.createUnmarshaller(); | ||
Item oldItem = null; | ||
for(;;) { | ||
RSS newrss = null; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Зачем здесь объявлять переменную?
for (int i = 0; i < newItems.size(); i++) { | ||
if (newItems.get(i).getPubDate().equals(oldItem.getPubDate()) | ||
&& newItems.get(i).getTitle().equals(oldItem.getTitle())) { | ||
num = i; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Зачем нам вообще нужен num
? Почему нельзя в этом цикле просто вызывать addNewFile(directory, new News(...));
?
writer.flush(); | ||
if (num > 0) { | ||
log.info("news: " + num); | ||
for (int i = 0; i < num; i++) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
этот цикл тогда не понадобится
String line = ""; | ||
while ((line = rd.readLine()) != null) { | ||
resp.append(line); | ||
for (Item i : newrss.getChannel().getItems().subList(0, num)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
И этот тоже
} | ||
} | ||
|
||
static String sendRequest(String url) throws IOException { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
У этого метода ровно одно использование, почему он не заинлайнен?
|
||
static String sendRequest(String url) throws IOException { | ||
HttpClient client = new DefaultHttpClient(); | ||
HttpGet request = new HttpGet(url); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
final?
No description provided.